CE/Lecturer: Dr Trung Le | trunglm@monash.edu
Head Tutor: Mr Tuan Nguyen | tuan.Ng@monash.edu
Department of Data Science and AI, Faculty of Information Technology, Monash University, Australia
Surname: [Yee]
Firstname: [Darren Jer Shien]
Student ID: [31237223]
Email: [dyee0005@student.monash.edu]
Your tutorial time: [10AM Monday]
This notebook has been prepared for you to complete Assignment 1. The theme of this assignment is about practical knowledge and skills in deep neural networks, including feedforward and convolutional neural networks. Some sections have been partially completed to help you get started. The total marks for this notebook is 100.
Before getting started, you should read the entire notebook carefully once to understand what you need to do.
For each cell marked with #YOU ARE REQUIRED TO INSERT YOUR CODES IN THIS CELL, there will be places where you must supply your own codes when instructed.
This assignment contains three parts:
Hint: This assignment was essentially designed based on the lectures and tutorials sessions covered from Week 1 to Week 5. You are strongly recommended to go through these contents thoroughly which might help you to complete this assignment.
This assignment is to be completed individually and submitted to Moodle unit site. By the due date, you are required to submit one single zip file, named xxx_assignment01_solution.zip where xxx is your student ID, to the corresponding Assignment (Dropbox) in Moodle.
For example, if your student ID is 12356, then gather all of your assignment solution to folder, create a zip file named 123456_assignment01_solution.zip and submit this file.
Within this zip folder, you must submit the following files:
Since the notebook is quite big to load and work together, one recommended option is to split solution into three parts and work on them seperately. In that case, replace Assignment01_solution.ipynb by three notebooks: Assignment01_Part1_solution.ipynb, Assignment01_Part2_solution.ipynb and Assignment01_Part3_solution.ipynb
You can run your codes on Google Colab. In this case, you have to make a copy of your Google colab notebook including the traces and progresses of model training before submitting.
You also need to store your trained models to folder *./models* with recognizable file names (e.g., Part3_Sec3_2_model.h5).
The first part of this assignment is to demonstrate your knowledge in deep learning that you have acquired from the lectures and tutorials materials. Most of the contents in this assignment are drawn from the lectures and tutorials from weeks 1 to 3. Going through these materials before attempting this part is highly recommended.
**(a)** Exponential linear unit (ELU): $\text{ELU}(x)=\begin{cases} 0.1\left(\exp(x)-1\right) & \text{if}\,x\leq0\\ x & \text{if}\,x>0 \end{cases}$
**(b)** Gaussian Error Linear Unit (GELU): $\text{GELU}(x)=x\Phi(x)$ where $\Phi(x)$ is the probability cummulative function of the standard Gaussian distribution or $\Phi(x) = \mathbb{P}\left(X\leq x\right)$ where $X\sim\mathcal{N}\left(0,1\right)$. In addition, the GELU activation fuction (the link for the main paper) has been currently widely used in the state-of-the-art Vision for Transformers (e.g., here is the link for the main ViT paper).
Your ansewers here
**Question 1.1**
Derrivative of Exponential linear unit (ELU): $\text{ELU}(x)=\begin{cases}
0.1\left(\exp(x)\right) & \text{if}\,x\leq0\\
1 & \text{if}\,x>0
\end{cases}$
Steps:
x <= 0
= 1/10 (d/dx[exp(x)] + d/dx[-1])
= (exp(x) +0) /10
= 0.1 (exp(x))
x > 0
= d/dx(x)
= 1
output range = [-0.1,infinity]
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
# Elu code
def elu (x, alpha):
return np.where(x > 0,x, alpha*(np.exp(x)-1))
def elu_derivative(x, alpha):
return np.where(x > 0, 1, alpha*np.exp(x))
# Plotting code
x = np.linspace(-5, 5, 400)
alpha = 0.1
elu_values = elu(x, alpha)
elu_derivative_values = elu_derivative(x, alpha)
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 2)
plt.plot(x, elu_values, label="ELU")
plt.plot(x, elu_derivative_values, label="ELU's Derivative")
plt.title("ELU and Derivative Values")
plt.xlabel('x')
plt.ylabel("f'(x)")
plt.legend()
plt.tight_layout()
plt.show()
Formula of GELU =
GELU(x)=xP(X≤x)=xΦ(x)
which can be approximated to
$0.5x(1+tanh[√(2/π)−(x+0.044715x^3)])$
Derivative of gelu
= d/dx ($0.5x(1+tanh[√(2/π)−(x+0.044715x^3)])$)
output range = [-infinity,inifinty]
#Gelu code
def gelu(x):
cdf = 0.5 * (1.0 + tf.math.erf(x / tf.sqrt(2.0)))
return x * cdf
def gelu_derivative(x):
with tf.GradientTape() as tape:
tape.watch(x)
y = gelu(x)
return tape.gradient(y, x)
#Plotting code
x = np.linspace(-5, 5, 400)
x_tf = tf.constant(x, dtype=tf.float32)
gelu_values = gelu(x_tf)
gelu_derivative_values = gelu_derivative(x_tf)
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 2)
plt.plot(x, gelu_values, label="GELU")
plt.plot(x, gelu_derivative_values, label="GELU's Derivative")
plt.title("GELU and Derivative Values")
plt.xlabel('x')
plt.ylabel("f'(x)")
plt.legend()
plt.tight_layout()
plt.show()
**Numpy is possibly being used in the following questions. You need to import numpy here.**
import numpy as np
ReLU activation function as shown in the following figure¶
**(a)** What is the numerical value of the latent presentation $h^1(x)$?
**(b)** What is the numerical value of the latent presentation $h^2(x)$?
**(c)** What is the numerical value of the logit $h^3(x)$?
**(d)** What is the corresonding prediction probabilities $p(x)$?
**(e)** What is the predicted label $\hat{y}$? Is it a correct and an incorect prediction? Remind that $y=3$.
**(f)** What is the cross-entropy loss caused by the feed-forward neural network at $(x,y)$? Remind that $y=3$.
**(g)** Why the cross-entropy loss caused by the feed-forward neural network at $(x,y)$ (i.e., $\text{CE}(1_y, p(x))$) is always non-negative? When does this $\text{CE}(1_y, p(x))$ loss get the value $0$? Note that you need to answer this question for a general pair $(x,y)$ and a general feed-forward neural network with for example $M=4$ classes?
You need to show both formulas and numerical results for earning full mark. Although it is optional, it is great if you show your numpy code for your computation.
# Answers for Questions
x = np.matrix([[1],[-1],[-2]])
w1 = np.matrix([[1,-1,2],[-1,0.5,1],[-2,1,2],[0,0,1]])
w2 = np.matrix([[-1,1,0,1],[1,1,0,-2],[0.5,-1,2,0]])
w3 = np.matrix ([[1,-2,0],[0,2,0],[1,-1,1],[0.5,1,2]])
b1 = np.matrix ([[1],[0],[1],[0]])
b2 = np.matrix ([[0],[0.5],[1]])
b3 = np.matrix ([[-1],[1],[-1],[1]])
# h1
a1 = np.maximum(np.dot(w1, x) + b1, 0)
#h2
a2 = np.maximum(np.dot(w2, a1) + b2, 0)
#h3 logit
a3 = np.dot(w3, a2) + b3
print ("a)\n")
print (a1,"\n")
print ("b)\n")
print (a2,"\n")
print ("c)\n")
print (a3,"\n")
def softmax(x):
"""Compute softmax values for each sets of scores in x."""
e_x = np.exp(x - np.max(x))
return e_x / e_x.sum(axis=0)
#prediction probability
prediction_probability = softmax(a3)
print ("d)\n")
print (prediction_probability,"\n")
print ("e)\n")
#prediction label
predicted_label = np.argmax(prediction_probability)
print ("Predicted label = ",predicted_label + 1,"\n")
print ("f)\n")
one_hot = np.zeros(prediction_probability.shape,dtype=int)
one_hot[predicted_label] = 1
target_index = 3
#ce loss
ce_loss = -np.log(prediction_probability[target_index-1])
print ("Cross entropy loss =",ce_loss[0,0],"\n")
print ("g)\n")
a) [[0.] [0.] [0.] [0.]] b) [[0. ] [0.5] [1. ]] c) [[-2. ] [ 2. ] [-0.5] [ 3.5]] d) [[0.00328114] [0.17914438] [0.01470507] [0.80286941]] e) Predicted label = 4 f) Cross entropy loss = 4.219563205900562 g)
For a general (x,y) pair and a general Neural Network with M = 4 classes, we can use it to prove why CE loss is always non-nagative. The calculation of CE loss involves us using the one-hot vector (all values are 0 except for the true class) and taking the negative log value between it and the original predicted probabilities using Softmax.
For example: the one-hot vector could be [0,1,0,0] and the predicted probabilities could be [0.2,0.4,0.2,0.2].
We calcuate CE loss by taking the summation of the negative log of each value in the one-hot vector and their corresponding probabilities
(-np.sum(one_hot np.log(predicted))*
Because of that, it means that we can ignore all the other classes that are not true since their 0 value in the one-hot vector would mean that their values will not be included in the loss anyways. Log of any value between 0 and 1 (inclusive) is always negative and since our one_hot value for our true class is always one, we can deduce that the values (one_hot * np.log(predicted)) will always be negative. Then, by taking the negative sign outside into account, we can prove that the value will always be non-negative due to double negation.
For Question 1.3, you have two options: (i) do forward, backward propagation, and SGD update for one data example (15 points) and (ii) do forward, backward propagation, and SGD update for a batch of data examples (20 points). You can choose either (i) or (ii) to proceed.

We feed a data example $x$ with the label $y$ as shown in the figure. Answer the following questions.
You need to show both formulas, numerical results, and your numpy code for your computation for earning full marks.
#Code to generate random matrices and biases for W1, b1, W2, b2
import numpy as np
student_id = 31237223 #insert your student id here for example 1234
np.random.seed(student_id)
W1 = np.random.rand(5,3)
b1 = np.random.rand(5,1)
W2 = np.random.rand(3,5)
b2 = np.random.rand(3,1)
Forward propagation
**(a)** What is the value of $\bar{h}^{1}(x)$?
Show your fomular
**(b)** What is the value of $h^{1}(x)$?
Show your fomular
**(c)** What is the predicted value $\hat{y}$?
Show your fomular
**(d)** Suppose that we use the cross-entropy (CE) loss. What is the value of the CE loss $l$?
Show your fomular
Backward propagation
**(e)** What are the derivatives $\frac{\partial l}{\partial h^{2}},\frac{\partial l}{\partial W^{2}}$, and $\frac{\partial l}{\partial b^{2}}$?
Show your fomular
#Show your code
**(f)** What are the derivatives $\frac{\partial l}{\partial h^{1}}, \frac{\partial l}{\partial \bar{h}^{1}},\frac{\partial l}{\partial W^{1}}$, and $\frac{\partial l}{\partial b^{1}}$?
Show your fomular
#Show your code
SGD update
**(g)** Assume that we use SGD with learning rate $\eta=0.01$ to update the model parameters. What are the values of $W^2, b^2$ and $W^1, b^1$ after updating?
Show your fomular
#Show your code

We feed a batch $X$ with the labels $Y$ as shown in the figure. Note that $x^{T}$ represents the transpose vector of the vector $x$. Answer the following questions.
You need to show both formulas, numerical results, and your numpy code for your computation for earning full marks.
#Code to generate random matrices and biases for W1, b1, W2, b2
import numpy as np
student_id = 31237223 #insert your student id here for example 1234
np.random.seed(student_id)
W1 = np.random.rand(5,3)
b1 = np.random.rand(5,1)
W2 = np.random.rand(3,5)
b2 = np.random.rand(3,1)
Forward propagation
**(a)** What is the value of $\bar{h}^{1}(x)$?
Show your formula
# Show your code
def elu(x, alpha=0.1):
"""ELU activation function."""
y = np.where(x > 0, x, alpha * (np.exp(x) - 1))
return y
# Given input examples
x1 = np.matrix([ 1, -1, 1]).T
x2 = np.matrix([-1, 2, -1]).T
x3 = np.matrix([-1.5, 1, 0]).T
x4 = np.matrix([-1, 2, -1]).T
x5 = np.matrix([ 0, 2.5, 1.5]).T
# Stack the input examples to create a mini-batch matrix
mini_batch_matrix = np.hstack((x1, x2, x3, x4, x5))
qA = np.dot(W1,mini_batch_matrix) + b1
print ("answer:\n",qA)
answer: [[ 1.01009806 0.30364112 -0.17987178 0.30364112 2.93026733] [ 0.45984992 1.08805968 0.05061028 1.08805968 2.59282736] [ 1.11497155 1.27251212 0.33600097 1.27251212 3.0365595 ] [ 1.27415222 1.2745985 1.22620037 1.2745985 3.40831993] [ 0.64008934 0.37189129 -0.47838965 0.37189129 1.50595585]]
**(b)** What is the value of $h^{1}(x)$?
Show your formula
#Show your code
qB = elu(qA)
print ("answer:\n",qB)
answer: [[ 1.01009806 0.30364112 -0.01646227 0.30364112 2.93026733] [ 0.45984992 1.08805968 0.05061028 1.08805968 2.59282736] [ 1.11497155 1.27251212 0.33600097 1.27251212 3.0365595 ] [ 1.27415222 1.2745985 1.22620037 1.2745985 3.40831993] [ 0.64008934 0.37189129 -0.03802193 0.37189129 1.50595585]]
**(c)** What is the predicted value $\hat{y}$?
Show your formula
#Show your code
# correct solution:
qC = np.dot(W2,qB) + b2
def softmax(x):
"""Compute softmax values for each sets of scores in x."""
e_x = np.exp(x - np.max(x))
return e_x / e_x.sum(axis=0)
# Get softmax values
softmax_qC = softmax(qC)
prediction_qC = np.argmax(softmax(qC),axis=0)
# init one hot array
one_hot = np.zeros(softmax_qC.shape,dtype=int)
prediction_qC = np.expand_dims(prediction_qC, 0)
# create new one hot array based on prediction
np.put_along_axis(one_hot,prediction_qC,1,axis=0)
print ("softmax:\n",softmax_qC)
print ("one hot:\n",one_hot)
print ("prediction value\n",prediction_qC)
softmax: [[0.55341198 0.43520274 0.29568302 0.43520274 0.91412831] [0.23225884 0.20552616 0.25346639 0.20552616 0.03970293] [0.21432918 0.3592711 0.45085058 0.3592711 0.04616876]] one hot: [[1 1 0 1 1] [0 0 0 0 0] [0 0 1 0 0]] prediction value [[0 0 2 0 0]]
**(d)** Suppose that we use the cross-entropy (CE) loss. What is the value of the CE loss $l$?
Show your formula
#Show your code
# original target array = [2,1,3,1,2], scaled to fit array index
one_hot_target = np.zeros(softmax_qC.shape,dtype=int)
target = [1,0,2,0,1]
target = np.expand_dims(target, 0)
np.put_along_axis(one_hot_target,target,1,axis=0)
ce = - np.sum(one_hot_target * np.log(softmax_qC)) / 5
print ("ce loss:\n",ce)
ce loss: 1.4293477795659966
Backward propagation
**(e)** What are the derivatives $\frac{\partial l}{\partial h^{2}},\frac{\partial l}{\partial W^{2}}$, and $\frac{\partial l}{\partial b^{2}}$?
Show your formula
Part 1:
$\frac{\partial l}{\partial h^{2}}$ = $g2$ = $p^T - 1y ∈ ℝ^{1×n^2}$
Part 2:
$\frac{\partial l}{\partial W^{2}}$ = $\frac{\partial l}{\partial h^{2}} . \frac{\partial h^2}{\partial W^{2}}$
$\frac{\partial l}{\partial h^{2}}$ = $(g^2)^T$
$\frac{\partial h^2}{\partial W^{2}}$ = $(h^1)^T$
$\frac{\partial l}{\partial W^{2}}$ = $(g^2)^T . (h^1)^T ∈ ℝ^{n^2×n^1}$
Part 3:
$\frac{\partial l}{\partial b^{2}}$ = $\frac{\partial l}{\partial h^{2}} . \frac{\partial h^2}{\partial b^{2}}$
$\frac{\partial l}{\partial h^{2}}$ = $(g^2)^T$
$\frac{\partial h^2}{\partial b^{2}}$ = 1
$\frac{\partial l}{\partial b^{2}}$ = $(g^2)^T ∈ ℝ^{n^2×1}$
#Part 1:
p1 = softmax_qC - one_hot_target
print ("Part 1:\n",p1)
p2 = np.dot(p1,qB.T)
print ("\nPart 2:\n",p2)
p3 = p1.T
print ("\nPart 3:\n",p3)
matr = np.sum (p1.T,axis = 0)
print (matr.reshape(1, -1).T)
Part 1: [[ 0.55341198 -0.56479726 0.29568302 -0.56479726 0.91412831] [-0.76774116 0.20552616 0.25346639 0.20552616 -0.96029707] [ 0.21432918 0.3592711 -0.54914942 0.3592711 0.04616876]] Part 2: [[ 2.88978173 1.41056169 2.05477067 2.74355998 1.29954118] [-3.46878122 -2.38285275 -3.16377471 -3.41649147 -1.79435842] [ 0.57899949 0.97229106 1.10900404 0.6729315 0.49481725]] Part 3: [[ 0.55341198 -0.76774116 0.21432918] [-0.56479726 0.20552616 0.3592711 ] [ 0.29568302 0.25346639 -0.54914942] [-0.56479726 0.20552616 0.3592711 ] [ 0.91412831 -0.96029707 0.04616876]] [[ 0.63362879] [-1.06351951] [ 0.42989072]]
**(f)** What are the derivatives $\frac{\partial l}{\partial h^{1}}, \frac{\partial l}{\partial \bar{h}^{1}},\frac{\partial l}{\partial W^{1}}$, and $\frac{\partial l}{\partial b^{1}}$?
Show your formula
Part 1:
$\frac{\partial l}{\partial h^{1}}$ = $g1$ = $g2 . W2 ∈ ℝ^{1×n^2}$
Part 2:
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $\frac{\partial l}{\partial {h}^{1}} . \frac{\partial {h}^{1}}{\partial \bar{h}^{1}} $
$\frac{\partial l}{\partial {h}^{1}}$ = $g^1$
$\frac{\partial {h}^{1}}{\partial \bar{h}^{1}} $= $ELU'$($\bar{h}^{1}$)
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $g^1ELU'$($\bar{h}^{1}$)
Part 3:
$\frac{\partial l}{\partial W^{1}}$ = $\frac{\partial l}{\partial \bar{h}^{1}} . \frac{\partial \bar{h}^{1}}{\partial {W}^{1}} $
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $(\bar g^1)$
$\frac{\partial \bar{h}^{1}}{\partial {W}^{1}} $ = $(h^0)^T$
$\frac{\partial l}{\partial W^{1}}$ = $(\bar g^1)(h^0)^T$
Part 4:
$\frac{\partial l}{\partial b^{1}}$ = $\frac{\partial l}{\partial \bar{h}^{1}} . \frac{\partial \bar{h}^{1}}{\partial {b}^{1}} $
$\frac{\partial l}{\partial \bar{h}^{1}}$ = $(\bar g^1)$
$\frac{\partial \bar{h}^{1}}{\partial {b}^{1}}$ = 1
$\frac{\partial l}{\partial b^{1}}$ = $(\bar g^1)^T$
#Part 1
p1_f = np.dot (p1.T,W2)
print (p1_f)
def elu_derivative(x, alpha):
return np.where(x > 0, 1, alpha*np.exp(x))
p2_f = np.dot (p1_f,elu_derivative(qA,0.1))
print (p2_f)
p3_f = np.dot (p2_f,mini_batch_matrix.T)
print (p3_f)
p4_f = p2_f.T
print (p4_f)
[[-0.19335577 0.0518627 0.13949576 0.42682301 0.44642821] [-0.23272077 0.05467927 -0.14588584 -0.09308672 -0.44906295] [ 0.39047495 -0.09584548 0.07857322 -0.16522643 0.23100242] [-0.23272077 0.05467927 -0.14588584 -0.09308672 -0.44906295] [-0.09032912 0.02835206 0.23229467 0.52259559 0.7339236 ]] [[ 0.87125391 0.87125391 0.62969772 0.87125391 0.87125391] [-0.866077 -0.866077 -0.2315663 -0.866077 -0.866077 ] [ 0.43897868 0.43897868 -0.13556222 0.43897868 0.43897868] [-0.866077 -0.866077 -0.2315663 -0.866077 -0.866077 ] [ 1.4268368 1.4268368 0.82118359 1.4268368 1.4268368 ]] [[-1.8158005 5.42159424 0.43562696] [ 1.21342645 -4.9949898 -0.4330385 ] [-0.23563535 2.27882052 0.21948934] [ 1.21342645 -4.9949898 -0.4330385 ] [-2.6586122 8.66878602 0.7134184 ]] [[ 0.87125391 -0.866077 0.43897868 -0.866077 1.4268368 ] [ 0.87125391 -0.866077 0.43897868 -0.866077 1.4268368 ] [ 0.62969772 -0.2315663 -0.13556222 -0.2315663 0.82118359] [ 0.87125391 -0.866077 0.43897868 -0.866077 1.4268368 ] [ 0.87125391 -0.866077 0.43897868 -0.866077 1.4268368 ]]
SGD update
**(g)** Assume that we use SGD with learning rate $\eta=0.01$ to update the model parameters. What are the values of $W^2, b^2$ and $W^1, b^1$ after updating?
Show your formula
#Show your code
learning_rate = 0.01
# Update W2 and b2 using SGD
W2 -= learning_rate * p2
b2 -= learning_rate * matr.reshape(1, -1).T
print ("W2:\n",W2,"\n","b2:\n",b2)
# Update W1 and b1 using SGD
W1 -= learning_rate * p3_f
# b1 -= learning_rate * np.sum(p4_f)
print ("W1:\n",W1,"\n","b1:\n",b1)
W2: [[0.729301 0.50671945 0.87949781 0.79949605 0.97970768] [0.85412348 0.52304093 0.67725963 0.31933795 0.2083518 ] [0.06961929 0.675661 0.62844156 0.87102426 0.19679131]] b2: [[0.11045054] [0.69475357] [0.50990836]] W1: [[0.80534526 0.6338631 0.59380351] [0.73795092 0.85351641 0.14549009] [0.82473595 0.72886885 0.22414081] [0.21397833 0.6319377 0.65097635] [0.84134583 0.38883133 0.02548392]] b1: [[0.31283007] [0.37217154] [0.81791331] [0.98338146] [0.26823072]]
The first part of this assignment is to demonstrate your basis knowledge in deep learning that you have acquired from the lectures and tutorials materials. Most of the contents in this assignment are drawn from the tutorials covered from weeks 1 to 4. Going through these materials before attempting this assignment is highly recommended.
In the first part of this assignment, you are going to work with the FashionMNIST dataset for image recognition task. It has the exact same format as MNIST (70,000 grayscale images of 28 × 28 pixels each with 10 classes), but the images represent fashion items rather than handwritten digits, so each class is more diverse, and the problem is significantly more challenging than MNIST.
We first use keras incoporated in TensorFlow 2.x for loading the training and testing sets.
import tensorflow as tf
from tensorflow import keras
tf.random.set_seed(1234)
We first use keras datasets in TF 2.x to load Fashion MNIST dataset.
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full_img, y_train_full), (X_test_img, y_test) = fashion_mnist.load_data()
The shape of X_train_full_img is $(60000, 28, 28 )$ and that of X_test_img is $(10000, 28, 28)$. We next convert them to matrices of vectors and store in X_train_full and X_test.
num_train = X_train_full_img.shape[0]
num_test = X_test_img.shape[0]
#Get X_train_full and X_test
X_train_full = X_train_full_img.reshape(num_train,-1)
X_test = X_test_img.reshape(num_test, -1)
#Print shape of test and train set
print("train set shape:\n",X_train_full.shape, y_train_full.shape)
print("test set shape: \n",X_test.shape, y_test.shape)
train set shape: (60000, 784) (60000,) test set shape: (10000, 784) (10000,)
You need to write the code to address the following requirements:
You have now the separate training, validation, and testing sets for training your model.
import math
N = X_train_full.shape[0]
i = math.floor(0.9*N)
n_classes= 10
shuffle = np.random.permutation(N)
#split and shuffle the dataset according to method taught in the tutorials
valid_idx = math.floor(0.1*N)
X_train, y_train = X_train_full[shuffle][:i],y_train_full[shuffle][:i]
X_valid, y_valid = X_train_full[shuffle][i:i+valid_idx],y_train_full[shuffle][i:i+valid_idx]
X_train, X_valid, X_test = X_train/255.0 , X_valid/255.0 , X_test/255.0
print ('Train set', X_train.shape, y_train.shape)
print('Validation set', X_valid.shape, y_valid.shape)
print('Test set', X_test.shape, y_test.shape)
Train set (54000, 784) (54000,) Validation set (6000, 784) (6000,) Test set (10000, 784) (10000,)
We now develop a feed-forward neural network with the architecture $784 \rightarrow 40(ReLU) \rightarrow 30(ReLU) \rightarrow 10(softmax)$. You can choose your own way to implement your network and an optimizer of interest. You should train model in $50$ epochs and evaluate the trained model on the test set.
Baseline Accuracy:
# Code adapted from tutorials
class DNN:
def __init__(self,n1,n2,act,n_classes=10, optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
batch_size=32, epochs=1, alpha=0.001):
self.n_classes = n_classes
self.batch_size = batch_size
self.epochs = epochs
self.optimizer = optimizer
self.alpha = alpha
self.n1 = n1
self.n2 = n2
self.act = act
# create a tensorflow dataset for training
self.train_set = tf.data.Dataset.from_tensor_slices((X_train, y_train))
# create a tensorflow dataset for validation
self.valid_set = tf.data.Dataset.from_tensor_slices((X_valid, y_valid))
# create a tensorflow dataset for validation
self.test_set = tf.data.Dataset.from_tensor_slices((X_test, y_test))
# batching train and valid sets
self.train_set = self.train_set.batch(self.batch_size).prefetch(1)
self.valid_set = self.valid_set.batch(self.batch_size).prefetch(1)
self.test_set = self.test_set.batch(self.batch_size).prefetch(1)
tf.keras.backend.set_floatx('float64')
def build(self):
self.model = tf.keras.Sequential([
tf.keras.layers.Dense(self.n1, activation= self.act),
tf.keras.layers.Dense(self.n2, activation= self.act),
tf.keras.layers.Dense(self.n_classes, activation='softmax')
])
def compute_loss(self, X, y): # X is data batch, y is label batch
pred_probs = self.model(X)
l1 = tf.keras.losses.sparse_categorical_crossentropy(y, pred_probs) # Cross entropy loss
l2 = tf.add_n([tf.nn.l2_loss(w) for w in self.model.trainable_weights])
l2 = tf.expand_dims(l2, axis=-1)
return l1 + self.alpha * l2
def compute_grads(self, X, y):
with tf.GradientTape() as g: # use gradient tape to compute gradients
loss = self.compute_loss(X, y)
grads = g.gradient(loss, self.model.trainable_variables) # compute gradients w.r.t. all trainable variables
return grads
def train_one_batch(self, X, y): # train in one batch
grads = self.compute_grads(X, y)
# the gradients will be applied according to optimizer for example SGD, Adam, and etc.
self.optimizer.apply_gradients(zip(grads, self.model.trainable_variables))
def evaluate(self, tf_dataset=None):
dataset_loss = tf.keras.metrics.Mean()
dataset_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
for X, y in tf_dataset:
loss = self.compute_loss(X, y)
dataset_loss.update_state(loss)
dataset_accuracy.update_state(y, self.model(X, training=False))
return dataset_loss.result(), dataset_accuracy.result()
def train(self):
for epoch in range(self.epochs):
for X, y in self.train_set: # use batch_index if you want to display something in iterations
self.train_one_batch(X, y)
train_loss, train_acc = self.evaluate(self.train_set)
valid_loss, valid_acc = self.evaluate(self.valid_set)
print('Epoch {}: train acc= {:.4f}, train loss= {:.4f} | valid acc= {:.4f}, valid loss= {:.4f}'.format(
epoch + 1, train_acc, train_loss, valid_acc, valid_loss))
return valid_acc,valid_loss
def evaluate_test_set(self, tf_dataset=None):
dataset_loss = tf.keras.metrics.Mean()
dataset_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
for X, y in self.test_set:
loss = self.compute_loss(X, y)
dataset_loss.update_state(loss)
dataset_accuracy.update_state(y, self.model(X, training=False))
return ('test acc = {:.4f}, test loss = {:.4f}'.format(
dataset_accuracy.result().numpy(), dataset_loss.result().numpy()))
opt = tf.keras.optimizers.Adam()
dnn = DNN(40,30,'relu',optimizer=opt, epochs=50, batch_size=64)
dnn.build()
dnn.train()
dnn.evaluate_test_set()
Epoch 1: train acc= 0.8448, train loss= 0.5111 | valid acc= 0.8433, valid loss= 0.5197 Epoch 2: train acc= 0.8578, train loss= 0.4694 | valid acc= 0.8518, valid loss= 0.4884 Epoch 3: train acc= 0.8651, train loss= 0.4498 | valid acc= 0.8565, valid loss= 0.4749 Epoch 4: train acc= 0.8698, train loss= 0.4372 | valid acc= 0.8622, valid loss= 0.4687 Epoch 5: train acc= 0.8728, train loss= 0.4290 | valid acc= 0.8648, valid loss= 0.4646 Epoch 6: train acc= 0.8756, train loss= 0.4203 | valid acc= 0.8663, valid loss= 0.4598 Epoch 7: train acc= 0.8780, train loss= 0.4150 | valid acc= 0.8672, valid loss= 0.4557 Epoch 8: train acc= 0.8811, train loss= 0.4090 | valid acc= 0.8693, valid loss= 0.4532 Epoch 9: train acc= 0.8834, train loss= 0.4041 | valid acc= 0.8707, valid loss= 0.4501 Epoch 10: train acc= 0.8849, train loss= 0.4018 | valid acc= 0.8690, valid loss= 0.4492 Epoch 11: train acc= 0.8873, train loss= 0.3955 | valid acc= 0.8723, valid loss= 0.4443 Epoch 12: train acc= 0.8887, train loss= 0.3925 | valid acc= 0.8720, valid loss= 0.4431 Epoch 13: train acc= 0.8899, train loss= 0.3911 | valid acc= 0.8723, valid loss= 0.4440 Epoch 14: train acc= 0.8918, train loss= 0.3861 | valid acc= 0.8757, valid loss= 0.4394 Epoch 15: train acc= 0.8921, train loss= 0.3862 | valid acc= 0.8738, valid loss= 0.4398 Epoch 16: train acc= 0.8929, train loss= 0.3834 | valid acc= 0.8745, valid loss= 0.4371 Epoch 17: train acc= 0.8921, train loss= 0.3847 | valid acc= 0.8722, valid loss= 0.4393 Epoch 18: train acc= 0.8922, train loss= 0.3847 | valid acc= 0.8745, valid loss= 0.4420 Epoch 19: train acc= 0.8940, train loss= 0.3799 | valid acc= 0.8755, valid loss= 0.4372 Epoch 20: train acc= 0.8936, train loss= 0.3818 | valid acc= 0.8743, valid loss= 0.4398 Epoch 21: train acc= 0.8925, train loss= 0.3861 | valid acc= 0.8732, valid loss= 0.4425 Epoch 22: train acc= 0.8932, train loss= 0.3828 | valid acc= 0.8722, valid loss= 0.4410 Epoch 23: train acc= 0.8943, train loss= 0.3817 | valid acc= 0.8737, valid loss= 0.4413 Epoch 24: train acc= 0.8942, train loss= 0.3820 | valid acc= 0.8728, valid loss= 0.4422 Epoch 25: train acc= 0.8960, train loss= 0.3772 | valid acc= 0.8747, valid loss= 0.4386 Epoch 26: train acc= 0.8963, train loss= 0.3771 | valid acc= 0.8725, valid loss= 0.4370 Epoch 27: train acc= 0.8964, train loss= 0.3754 | valid acc= 0.8733, valid loss= 0.4376 Epoch 28: train acc= 0.8956, train loss= 0.3781 | valid acc= 0.8717, valid loss= 0.4388 Epoch 29: train acc= 0.8966, train loss= 0.3741 | valid acc= 0.8742, valid loss= 0.4360 Epoch 30: train acc= 0.8960, train loss= 0.3758 | valid acc= 0.8753, valid loss= 0.4384 Epoch 31: train acc= 0.8949, train loss= 0.3793 | valid acc= 0.8742, valid loss= 0.4425 Epoch 32: train acc= 0.8957, train loss= 0.3771 | valid acc= 0.8737, valid loss= 0.4409 Epoch 33: train acc= 0.8982, train loss= 0.3739 | valid acc= 0.8725, valid loss= 0.4397 Epoch 34: train acc= 0.8965, train loss= 0.3753 | valid acc= 0.8740, valid loss= 0.4408 Epoch 35: train acc= 0.8964, train loss= 0.3760 | valid acc= 0.8732, valid loss= 0.4419 Epoch 36: train acc= 0.8979, train loss= 0.3733 | valid acc= 0.8753, valid loss= 0.4400 Epoch 37: train acc= 0.8978, train loss= 0.3753 | valid acc= 0.8723, valid loss= 0.4420 Epoch 38: train acc= 0.8960, train loss= 0.3793 | valid acc= 0.8735, valid loss= 0.4455 Epoch 39: train acc= 0.8985, train loss= 0.3732 | valid acc= 0.8742, valid loss= 0.4417 Epoch 40: train acc= 0.8986, train loss= 0.3735 | valid acc= 0.8742, valid loss= 0.4427 Epoch 41: train acc= 0.8984, train loss= 0.3719 | valid acc= 0.8762, valid loss= 0.4397 Epoch 42: train acc= 0.8995, train loss= 0.3694 | valid acc= 0.8757, valid loss= 0.4373 Epoch 43: train acc= 0.8996, train loss= 0.3710 | valid acc= 0.8753, valid loss= 0.4415 Epoch 44: train acc= 0.8998, train loss= 0.3714 | valid acc= 0.8752, valid loss= 0.4422 Epoch 45: train acc= 0.9011, train loss= 0.3692 | valid acc= 0.8760, valid loss= 0.4396 Epoch 46: train acc= 0.9007, train loss= 0.3698 | valid acc= 0.8770, valid loss= 0.4419 Epoch 47: train acc= 0.9011, train loss= 0.3684 | valid acc= 0.8748, valid loss= 0.4407 Epoch 48: train acc= 0.9016, train loss= 0.3681 | valid acc= 0.8780, valid loss= 0.4394 Epoch 49: train acc= 0.9010, train loss= 0.3707 | valid acc= 0.8752, valid loss= 0.4425 Epoch 50: train acc= 0.9019, train loss= 0.3674 | valid acc= 0.8765, valid loss= 0.4411
'test acc = 0.8721, test loss = 0.4527'
Assume that you need to tune the number of neurons on the first and second hidden layers $n_1 \in \{20, 40\}$, $n_2 \in \{20, 40\}$ and the used activation function $act \in \{sigmoid, tanh, relu\}$. The network has the architecture pattern $784 \rightarrow n_1 (act) \rightarrow n_2(act) \rightarrow 10(softmax)$ where $n_1, n_2$, and $act$ are in their grides. Write the code to tune the hyper-parameters $n_1, n_2$, and $act$. Note that you can freely choose the optimizer and learning rate of interest for this task.
For this question, I have decided to use the original SGD optimizer with the learning rate of 0.001 in order to test it. In the output space below, you will find the results of each model configuration running 10 epochs each. The final result shows n1= 40, n2 = 40, activation function = tanh as the best configuration.
Best Config (40,40,tanh) Accuracy after 10 epochs:
# Perform grid search using each possible configurtaions, compute 10 epochs and save the best values / configuration
lst_n1 = [20,40]
lst_n2 = [20,40]
lst_activation = ['sigmoid','tanh','relu']
best_acc= - np.inf
best_history = None
best_model = None
for n1 in lst_n1:
for n2 in lst_n2:
for act in lst_activation:
print('\nThis is for model n1= {}, n2 = {}, activation function = {}'.format(n1,n2, act))
opt = tf.keras.optimizers.SGD(learning_rate=0.001)
dnn = DNN(n1,n2,act,optimizer=opt, epochs=10, batch_size=64)
dnn.build()
valid_acc, valid_loss = dnn.train()
print('\tvalid acc = {}, valid loss = {}'.format(valid_acc, valid_loss))
if(valid_acc > best_acc):
best_model = dnn
best_acc = valid_acc
best_n1 = n1
best_n2 = n2
best_act = act
print('\nThe best model is with n1= {}, n2 = {}, activation function = {}'.format(best_n1,best_n2, best_act))
This is for model n1= 20, n2 = 20, activation function = sigmoid Epoch 1: train acc= 0.4086, train loss= 1.9378 | valid acc= 0.4068, valid loss= 1.9373 Epoch 2: train acc= 0.5539, train loss= 1.7302 | valid acc= 0.5563, valid loss= 1.7289 Epoch 3: train acc= 0.5677, train loss= 1.6562 | valid acc= 0.5697, valid loss= 1.6547 Epoch 4: train acc= 0.5812, train loss= 1.6286 | valid acc= 0.5822, valid loss= 1.6269 Epoch 5: train acc= 0.5960, train loss= 1.6133 | valid acc= 0.5995, valid loss= 1.6115 Epoch 6: train acc= 0.6110, train loss= 1.6030 | valid acc= 0.6090, valid loss= 1.6011 Epoch 7: train acc= 0.6218, train loss= 1.5955 | valid acc= 0.6195, valid loss= 1.5935 Epoch 8: train acc= 0.6276, train loss= 1.5899 | valid acc= 0.6243, valid loss= 1.5878 Epoch 9: train acc= 0.6314, train loss= 1.5858 | valid acc= 0.6273, valid loss= 1.5837 Epoch 10: train acc= 0.6340, train loss= 1.5828 | valid acc= 0.6305, valid loss= 1.5806 valid acc = 0.6305, valid loss = 1.5805526927139606 This is for model n1= 20, n2 = 20, activation function = tanh Epoch 1: train acc= 0.8189, train loss= 0.8983 | valid acc= 0.8262, valid loss= 0.8904 Epoch 2: train acc= 0.8326, train loss= 0.8179 | valid acc= 0.8373, valid loss= 0.8120 Epoch 3: train acc= 0.8372, train loss= 0.7907 | valid acc= 0.8422, valid loss= 0.7857 Epoch 4: train acc= 0.8400, train loss= 0.7792 | valid acc= 0.8445, valid loss= 0.7746 Epoch 5: train acc= 0.8412, train loss= 0.7733 | valid acc= 0.8455, valid loss= 0.7689 Epoch 6: train acc= 0.8424, train loss= 0.7697 | valid acc= 0.8468, valid loss= 0.7655 Epoch 7: train acc= 0.8426, train loss= 0.7674 | valid acc= 0.8472, valid loss= 0.7633 Epoch 8: train acc= 0.8428, train loss= 0.7657 | valid acc= 0.8482, valid loss= 0.7617 Epoch 9: train acc= 0.8434, train loss= 0.7644 | valid acc= 0.8480, valid loss= 0.7605 Epoch 10: train acc= 0.8436, train loss= 0.7634 | valid acc= 0.8483, valid loss= 0.7595 valid acc = 0.8483333333333334, valid loss = 0.75954389014383 This is for model n1= 20, n2 = 20, activation function = relu Epoch 1: train acc= 0.8170, train loss= 0.8341 | valid acc= 0.8268, valid loss= 0.8276 Epoch 2: train acc= 0.8306, train loss= 0.7501 | valid acc= 0.8380, valid loss= 0.7463 Epoch 3: train acc= 0.8346, train loss= 0.7235 | valid acc= 0.8392, valid loss= 0.7208 Epoch 4: train acc= 0.8372, train loss= 0.7125 | valid acc= 0.8390, valid loss= 0.7106 Epoch 5: train acc= 0.8389, train loss= 0.7064 | valid acc= 0.8413, valid loss= 0.7046 Epoch 6: train acc= 0.8394, train loss= 0.7040 | valid acc= 0.8418, valid loss= 0.7023 Epoch 7: train acc= 0.8397, train loss= 0.7021 | valid acc= 0.8438, valid loss= 0.7004 Epoch 8: train acc= 0.8396, train loss= 0.7016 | valid acc= 0.8447, valid loss= 0.6999 Epoch 9: train acc= 0.8402, train loss= 0.7012 | valid acc= 0.8432, valid loss= 0.6995 Epoch 10: train acc= 0.8412, train loss= 0.6974 | valid acc= 0.8462, valid loss= 0.6956 valid acc = 0.8461666666666666, valid loss = 0.6956203009179754 This is for model n1= 20, n2 = 40, activation function = sigmoid Epoch 1: train acc= 0.4802, train loss= 1.9050 | valid acc= 0.4842, valid loss= 1.9036 Epoch 2: train acc= 0.5258, train loss= 1.6948 | valid acc= 0.5308, valid loss= 1.6902 Epoch 3: train acc= 0.5655, train loss= 1.6388 | valid acc= 0.5692, valid loss= 1.6338 Epoch 4: train acc= 0.5841, train loss= 1.6130 | valid acc= 0.5882, valid loss= 1.6082 Epoch 5: train acc= 0.5939, train loss= 1.5965 | valid acc= 0.5987, valid loss= 1.5924 Epoch 6: train acc= 0.6021, train loss= 1.5845 | valid acc= 0.6070, valid loss= 1.5807 Epoch 7: train acc= 0.6102, train loss= 1.5751 | valid acc= 0.6158, valid loss= 1.5717 Epoch 8: train acc= 0.6185, train loss= 1.5659 | valid acc= 0.6202, valid loss= 1.5628 Epoch 9: train acc= 0.6275, train loss= 1.5558 | valid acc= 0.6260, valid loss= 1.5529 Epoch 10: train acc= 0.6371, train loss= 1.5455 | valid acc= 0.6362, valid loss= 1.5426 valid acc = 0.6361666666666667, valid loss = 1.542632225107603 This is for model n1= 20, n2 = 40, activation function = tanh Epoch 1: train acc= 0.8239, train loss= 0.8762 | valid acc= 0.8340, valid loss= 0.8689 Epoch 2: train acc= 0.8367, train loss= 0.7930 | valid acc= 0.8423, valid loss= 0.7863 Epoch 3: train acc= 0.8410, train loss= 0.7651 | valid acc= 0.8448, valid loss= 0.7587 Epoch 4: train acc= 0.8432, train loss= 0.7536 | valid acc= 0.8470, valid loss= 0.7473 Epoch 5: train acc= 0.8439, train loss= 0.7478 | valid acc= 0.8490, valid loss= 0.7416 Epoch 6: train acc= 0.8442, train loss= 0.7443 | valid acc= 0.8497, valid loss= 0.7383 Epoch 7: train acc= 0.8444, train loss= 0.7420 | valid acc= 0.8510, valid loss= 0.7361 Epoch 8: train acc= 0.8448, train loss= 0.7402 | valid acc= 0.8512, valid loss= 0.7345 Epoch 9: train acc= 0.8451, train loss= 0.7388 | valid acc= 0.8513, valid loss= 0.7332 Epoch 10: train acc= 0.8452, train loss= 0.7377 | valid acc= 0.8510, valid loss= 0.7321 valid acc = 0.851, valid loss = 0.7321288660559229 This is for model n1= 20, n2 = 40, activation function = relu Epoch 1: train acc= 0.8120, train loss= 0.8586 | valid acc= 0.8227, valid loss= 0.8473 Epoch 2: train acc= 0.8259, train loss= 0.7630 | valid acc= 0.8370, valid loss= 0.7529 Epoch 3: train acc= 0.8310, train loss= 0.7307 | valid acc= 0.8427, valid loss= 0.7214 Epoch 4: train acc= 0.8342, train loss= 0.7166 | valid acc= 0.8443, valid loss= 0.7083 Epoch 5: train acc= 0.8370, train loss= 0.7090 | valid acc= 0.8460, valid loss= 0.7018 Epoch 6: train acc= 0.8390, train loss= 0.7048 | valid acc= 0.8467, valid loss= 0.6986 Epoch 7: train acc= 0.8404, train loss= 0.7022 | valid acc= 0.8463, valid loss= 0.6969 Epoch 8: train acc= 0.8414, train loss= 0.7000 | valid acc= 0.8475, valid loss= 0.6954 Epoch 9: train acc= 0.8422, train loss= 0.6988 | valid acc= 0.8475, valid loss= 0.6947 Epoch 10: train acc= 0.8428, train loss= 0.6977 | valid acc= 0.8463, valid loss= 0.6940 valid acc = 0.8463333333333334, valid loss = 0.6939578115844145 This is for model n1= 40, n2 = 20, activation function = sigmoid Epoch 1: train acc= 0.5335, train loss= 1.8716 | valid acc= 0.5398, valid loss= 1.8687 Epoch 2: train acc= 0.5695, train loss= 1.6742 | valid acc= 0.5695, valid loss= 1.6713 Epoch 3: train acc= 0.5835, train loss= 1.6071 | valid acc= 0.5860, valid loss= 1.6042 Epoch 4: train acc= 0.5970, train loss= 1.5741 | valid acc= 0.6035, valid loss= 1.5710 Epoch 5: train acc= 0.6095, train loss= 1.5544 | valid acc= 0.6120, valid loss= 1.5511 Epoch 6: train acc= 0.6196, train loss= 1.5409 | valid acc= 0.6190, valid loss= 1.5375 Epoch 7: train acc= 0.6263, train loss= 1.5311 | valid acc= 0.6290, valid loss= 1.5276 Epoch 8: train acc= 0.6333, train loss= 1.5241 | valid acc= 0.6318, valid loss= 1.5206 Epoch 9: train acc= 0.6401, train loss= 1.5193 | valid acc= 0.6372, valid loss= 1.5158 Epoch 10: train acc= 0.6492, train loss= 1.5159 | valid acc= 0.6450, valid loss= 1.5124 valid acc = 0.645, valid loss = 1.5124126107648554 This is for model n1= 40, n2 = 20, activation function = tanh Epoch 1: train acc= 0.8199, train loss= 0.9488 | valid acc= 0.8318, valid loss= 0.9380 Epoch 2: train acc= 0.8319, train loss= 0.8268 | valid acc= 0.8392, valid loss= 0.8179 Epoch 3: train acc= 0.8369, train loss= 0.7848 | valid acc= 0.8425, valid loss= 0.7772 Epoch 4: train acc= 0.8397, train loss= 0.7686 | valid acc= 0.8455, valid loss= 0.7619 Epoch 5: train acc= 0.8416, train loss= 0.7615 | valid acc= 0.8470, valid loss= 0.7554 Epoch 6: train acc= 0.8431, train loss= 0.7577 | valid acc= 0.8475, valid loss= 0.7520 Epoch 7: train acc= 0.8439, train loss= 0.7553 | valid acc= 0.8485, valid loss= 0.7500 Epoch 8: train acc= 0.8446, train loss= 0.7537 | valid acc= 0.8485, valid loss= 0.7486 Epoch 9: train acc= 0.8449, train loss= 0.7525 | valid acc= 0.8500, valid loss= 0.7476 Epoch 10: train acc= 0.8451, train loss= 0.7516 | valid acc= 0.8502, valid loss= 0.7469 valid acc = 0.8501666666666666, valid loss = 0.7468579011961676 This is for model n1= 40, n2 = 20, activation function = relu Epoch 1: train acc= 0.8256, train loss= 0.8910 | valid acc= 0.8317, valid loss= 0.8841 Epoch 2: train acc= 0.8363, train loss= 0.7657 | valid acc= 0.8443, valid loss= 0.7600 Epoch 3: train acc= 0.8403, train loss= 0.7246 | valid acc= 0.8475, valid loss= 0.7197 Epoch 4: train acc= 0.8421, train loss= 0.7094 | valid acc= 0.8495, valid loss= 0.7051 Epoch 5: train acc= 0.8429, train loss= 0.7031 | valid acc= 0.8488, valid loss= 0.6995 Epoch 6: train acc= 0.8443, train loss= 0.6994 | valid acc= 0.8510, valid loss= 0.6962 Epoch 7: train acc= 0.8448, train loss= 0.6971 | valid acc= 0.8517, valid loss= 0.6942 Epoch 8: train acc= 0.8452, train loss= 0.6961 | valid acc= 0.8507, valid loss= 0.6935 Epoch 9: train acc= 0.8454, train loss= 0.6949 | valid acc= 0.8512, valid loss= 0.6924 Epoch 10: train acc= 0.8458, train loss= 0.6937 | valid acc= 0.8520, valid loss= 0.6914 valid acc = 0.852, valid loss = 0.6914116841091157 This is for model n1= 40, n2 = 40, activation function = sigmoid Epoch 1: train acc= 0.5500, train loss= 1.8929 | valid acc= 0.5500, valid loss= 1.8891 Epoch 2: train acc= 0.5796, train loss= 1.6515 | valid acc= 0.5815, valid loss= 1.6473 Epoch 3: train acc= 0.5964, train loss= 1.5792 | valid acc= 0.5997, valid loss= 1.5746 Epoch 4: train acc= 0.6082, train loss= 1.5494 | valid acc= 0.6098, valid loss= 1.5446 Epoch 5: train acc= 0.6198, train loss= 1.5339 | valid acc= 0.6212, valid loss= 1.5289 Epoch 6: train acc= 0.6316, train loss= 1.5238 | valid acc= 0.6360, valid loss= 1.5188 Epoch 7: train acc= 0.6455, train loss= 1.5157 | valid acc= 0.6468, valid loss= 1.5107 Epoch 8: train acc= 0.6604, train loss= 1.5071 | valid acc= 0.6618, valid loss= 1.5024 Epoch 9: train acc= 0.6738, train loss= 1.4965 | valid acc= 0.6722, valid loss= 1.4925 Epoch 10: train acc= 0.6830, train loss= 1.4858 | valid acc= 0.6805, valid loss= 1.4824 valid acc = 0.6805, valid loss = 1.4823851180197463 This is for model n1= 40, n2 = 40, activation function = tanh Epoch 1: train acc= 0.8302, train loss= 0.9330 | valid acc= 0.8393, valid loss= 0.9244 Epoch 2: train acc= 0.8391, train loss= 0.8040 | valid acc= 0.8458, valid loss= 0.7967 Epoch 3: train acc= 0.8428, train loss= 0.7601 | valid acc= 0.8472, valid loss= 0.7535 Epoch 4: train acc= 0.8444, train loss= 0.7431 | valid acc= 0.8487, valid loss= 0.7371 Epoch 5: train acc= 0.8456, train loss= 0.7355 | valid acc= 0.8493, valid loss= 0.7299 Epoch 6: train acc= 0.8462, train loss= 0.7313 | valid acc= 0.8503, valid loss= 0.7259 Epoch 7: train acc= 0.8467, train loss= 0.7286 | valid acc= 0.8508, valid loss= 0.7235 Epoch 8: train acc= 0.8469, train loss= 0.7267 | valid acc= 0.8513, valid loss= 0.7217 Epoch 9: train acc= 0.8475, train loss= 0.7251 | valid acc= 0.8522, valid loss= 0.7203 Epoch 10: train acc= 0.8478, train loss= 0.7237 | valid acc= 0.8525, valid loss= 0.7190 valid acc = 0.8525, valid loss = 0.7190313472134627 This is for model n1= 40, n2 = 40, activation function = relu Epoch 1: train acc= 0.8209, train loss= 0.9281 | valid acc= 0.8328, valid loss= 0.9171 Epoch 2: train acc= 0.8326, train loss= 0.7785 | valid acc= 0.8440, valid loss= 0.7703 Epoch 3: train acc= 0.8372, train loss= 0.7293 | valid acc= 0.8467, valid loss= 0.7229 Epoch 4: train acc= 0.8393, train loss= 0.7111 | valid acc= 0.8477, valid loss= 0.7055 Epoch 5: train acc= 0.8405, train loss= 0.7035 | valid acc= 0.8482, valid loss= 0.6980 Epoch 6: train acc= 0.8413, train loss= 0.6995 | valid acc= 0.8488, valid loss= 0.6945 Epoch 7: train acc= 0.8419, train loss= 0.6969 | valid acc= 0.8487, valid loss= 0.6922 Epoch 8: train acc= 0.8432, train loss= 0.6946 | valid acc= 0.8490, valid loss= 0.6902 Epoch 9: train acc= 0.8437, train loss= 0.6931 | valid acc= 0.8497, valid loss= 0.6888 Epoch 10: train acc= 0.8443, train loss= 0.6920 | valid acc= 0.8498, valid loss= 0.6879 valid acc = 0.8498333333333333, valid loss = 0.6879278224979427 The best model is with n1= 40, n2 = 40, activation function = tanh
best_model.model.save('models/best_cnn.h5')
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
Sharpness-aware minimization (SAM) (i.e., link for main paper from Google Deepmind) is a simple yet but efficient technique to improve the generalization ability of deep learning models on unseen data examples. In your research or your work, you might potentially use this idea. Your task is to read the paper and implement Sharpness-aware minimization (SAM). Finally, you need to apply SAM to the best architecture found in Question 2.4.
After using SAM, we can see that our model improved marginally when compared with the accuracies of the best config above.
Best Config (40,40,tanh) Accuracy without SAM:
Best Config (40,40,tanh) Accuracy with SAM:
Our model with SAM is able to generalize better as reflected in the training and validation accuracy since we apply the gradients twice at each turn (once with the default values / gradients and another with gradients after applying sharpness scaling).
class SAM(tf.keras.optimizers.Optimizer):
def __init__(self, base_optimizer, rho=0.05):
super(SAM, self).__init__(name="SAM")
self.base_optimizer = base_optimizer
self.rho = 0.05
self.epsilon = 1e-12
def apply_gradients (self,zipped_grads_and_trainable_variables):
# unzip the values taken in from the model
grads, trainable = zip(*zipped_grads_and_trainable_variables)
#zip it back and pass it into our base optimizer (Adam in this case)
self.base_optimizer.apply_gradients(zip(grads,trainable))
#initialize sharpness value
sharpness = 0.0
#compute the total sharpness value by taking the square of each gradients and reduce it into a single value
for grad in grads:
sharpness += tf.reduce_sum(tf.square(grad))
#initalize scaled_gradients array
scaled_grad = []
#apply sharpness to each gradient value
for grad in grads:
scaled_grad.append(self.rho * grad / (sharpness + self.epsilon))
#reapply gradients with the base optimizer with the sharpness adjusted values
self.base_optimizer.apply_gradients(zip(scaled_grad,trainable))
sam = SAM(opt)
opt = tf.keras.optimizers.Adam()
dnn = DNN(40,40,'tanh',optimizer=sam, epochs=50, batch_size=32)
dnn.build()
dnn.train()
dnn.evaluate_test_set()
Epoch 1: train acc= 0.8462, train loss= 0.5069 | valid acc= 0.8380, valid loss= 0.5244 Epoch 2: train acc= 0.8607, train loss= 0.4787 | valid acc= 0.8503, valid loss= 0.5011 Epoch 3: train acc= 0.8665, train loss= 0.4675 | valid acc= 0.8552, valid loss= 0.4931 Epoch 4: train acc= 0.8694, train loss= 0.4604 | valid acc= 0.8555, valid loss= 0.4882 Epoch 5: train acc= 0.8709, train loss= 0.4574 | valid acc= 0.8582, valid loss= 0.4876 Epoch 6: train acc= 0.8725, train loss= 0.4542 | valid acc= 0.8590, valid loss= 0.4862 Epoch 7: train acc= 0.8724, train loss= 0.4551 | valid acc= 0.8582, valid loss= 0.4887 Epoch 8: train acc= 0.8725, train loss= 0.4551 | valid acc= 0.8582, valid loss= 0.4901 Epoch 9: train acc= 0.8746, train loss= 0.4503 | valid acc= 0.8605, valid loss= 0.4855 Epoch 10: train acc= 0.8751, train loss= 0.4510 | valid acc= 0.8605, valid loss= 0.4865 Epoch 11: train acc= 0.8760, train loss= 0.4496 | valid acc= 0.8617, valid loss= 0.4848 Epoch 12: train acc= 0.8762, train loss= 0.4482 | valid acc= 0.8630, valid loss= 0.4836 Epoch 13: train acc= 0.8771, train loss= 0.4483 | valid acc= 0.8628, valid loss= 0.4825 Epoch 14: train acc= 0.8758, train loss= 0.4510 | valid acc= 0.8597, valid loss= 0.4860 Epoch 15: train acc= 0.8776, train loss= 0.4484 | valid acc= 0.8617, valid loss= 0.4842 Epoch 16: train acc= 0.8778, train loss= 0.4465 | valid acc= 0.8623, valid loss= 0.4826 Epoch 17: train acc= 0.8796, train loss= 0.4409 | valid acc= 0.8623, valid loss= 0.4777 Epoch 18: train acc= 0.8803, train loss= 0.4405 | valid acc= 0.8638, valid loss= 0.4774 Epoch 19: train acc= 0.8801, train loss= 0.4418 | valid acc= 0.8633, valid loss= 0.4792 Epoch 20: train acc= 0.8805, train loss= 0.4405 | valid acc= 0.8637, valid loss= 0.4780 Epoch 21: train acc= 0.8815, train loss= 0.4393 | valid acc= 0.8628, valid loss= 0.4775 Epoch 22: train acc= 0.8814, train loss= 0.4401 | valid acc= 0.8648, valid loss= 0.4783 Epoch 23: train acc= 0.8818, train loss= 0.4391 | valid acc= 0.8640, valid loss= 0.4774 Epoch 24: train acc= 0.8816, train loss= 0.4390 | valid acc= 0.8653, valid loss= 0.4779 Epoch 25: train acc= 0.8820, train loss= 0.4383 | valid acc= 0.8655, valid loss= 0.4771 Epoch 26: train acc= 0.8817, train loss= 0.4373 | valid acc= 0.8660, valid loss= 0.4765 Epoch 27: train acc= 0.8823, train loss= 0.4361 | valid acc= 0.8653, valid loss= 0.4758 Epoch 28: train acc= 0.8823, train loss= 0.4363 | valid acc= 0.8662, valid loss= 0.4758 Epoch 29: train acc= 0.8833, train loss= 0.4341 | valid acc= 0.8665, valid loss= 0.4742 Epoch 30: train acc= 0.8826, train loss= 0.4358 | valid acc= 0.8657, valid loss= 0.4762 Epoch 31: train acc= 0.8832, train loss= 0.4335 | valid acc= 0.8668, valid loss= 0.4739 Epoch 32: train acc= 0.8827, train loss= 0.4345 | valid acc= 0.8665, valid loss= 0.4757 Epoch 33: train acc= 0.8823, train loss= 0.4349 | valid acc= 0.8658, valid loss= 0.4762 Epoch 34: train acc= 0.8841, train loss= 0.4320 | valid acc= 0.8683, valid loss= 0.4740 Epoch 35: train acc= 0.8850, train loss= 0.4310 | valid acc= 0.8665, valid loss= 0.4732 Epoch 36: train acc= 0.8837, train loss= 0.4327 | valid acc= 0.8667, valid loss= 0.4753 Epoch 37: train acc= 0.8845, train loss= 0.4306 | valid acc= 0.8682, valid loss= 0.4734 Epoch 38: train acc= 0.8836, train loss= 0.4328 | valid acc= 0.8668, valid loss= 0.4755 Epoch 39: train acc= 0.8843, train loss= 0.4307 | valid acc= 0.8675, valid loss= 0.4735 Epoch 40: train acc= 0.8848, train loss= 0.4294 | valid acc= 0.8660, valid loss= 0.4718 Epoch 41: train acc= 0.8846, train loss= 0.4307 | valid acc= 0.8650, valid loss= 0.4735 Epoch 42: train acc= 0.8836, train loss= 0.4318 | valid acc= 0.8658, valid loss= 0.4750 Epoch 43: train acc= 0.8839, train loss= 0.4317 | valid acc= 0.8653, valid loss= 0.4754 Epoch 44: train acc= 0.8841, train loss= 0.4312 | valid acc= 0.8657, valid loss= 0.4748 Epoch 45: train acc= 0.8832, train loss= 0.4332 | valid acc= 0.8638, valid loss= 0.4768 Epoch 46: train acc= 0.8835, train loss= 0.4337 | valid acc= 0.8630, valid loss= 0.4770 Epoch 47: train acc= 0.8835, train loss= 0.4315 | valid acc= 0.8652, valid loss= 0.4751 Epoch 48: train acc= 0.8836, train loss= 0.4318 | valid acc= 0.8652, valid loss= 0.4751 Epoch 49: train acc= 0.8838, train loss= 0.4322 | valid acc= 0.8653, valid loss= 0.4757 Epoch 50: train acc= 0.8829, train loss= 0.4339 | valid acc= 0.8647, valid loss= 0.4777
'test acc = 0.8616, test loss = 0.4981'
This part of the asssignment is designed to assess your knowledge and coding skill with Tensorflow as well as hands-on experience with training Convolutional Neural Network (CNN).
The dataset used for this part is a specific dataset for this unit consisting of approximately $10,000$ images of $20$ classes, each of which has approximately 500 images. You can download the dataset at download here and then decompress to the folder datasets\FIT5215_Dataset in your assignment folder.
Your task is to build a CNN model using TF 2.x to classify the images. You're provided with the module models.py, which you can find in the assignment folder, with some of the following classes:
DatasetManager: Support with loading and spliting the dataset into the train-val-test sets. It also supports generating next batches for training. DatasetManager will be passed to CNN model for training and testing.DefaultModel: A base class for the CNN model.YourModel: The class you'll need to implement for building your CNN model. It inherits some useful attributes and functions from the base class DefaultModelmodels.py file for your purposes.Firstly, we need to run the following cells to load and preprocess the FIT5215 dataset.
%load_ext autoreload
%autoreload 2
Install the package imutils if you have not installed yet
! pip install imutils
Requirement already satisfied: imutils in c:\users\manut\anaconda3\envs\gpu\lib\site-packages (0.5.4)
import os
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
import models
from models import SimplePreprocessor, DatasetManager, DefaultModel
def create_label_folder_dict(adir):
sub_folders= [folder for folder in os.listdir(adir)
if os.path.isdir(os.path.join(adir, folder))]
label_folder_dict= dict()
for folder in sub_folders:
item= {folder: os.path.abspath(os.path.join(adir, folder))}
label_folder_dict.update(item)
return label_folder_dict
label_folder_dict= create_label_folder_dict("./datasets/FIT5215_Dataset")
The below code helps to create a data manager that contains all relevant methods used to manage and process the experimental data.
sp = SimplePreprocessor(width=32, height=32)
data_manager = DatasetManager([sp])
data_manager.load(label_folder_dict, verbose=100)
data_manager.process_data_label()
data_manager.train_valid_test_split()
birds 512 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 bottles 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 breads 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 butterfiles 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 cakes 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 cats 501 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 chickens 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 cows 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 dogs 501 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 ducks 496 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 elephants 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 fishes 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 handguns 448 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 horses 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 lions 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 lipsticks 400 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 seals 448 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 snakes 496 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 spiders 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 vases 368 Processed 100/500 Processed 200/500 Processed 300/500
Note that the object data_manager has the attributes relating to the training, validation, and testing sets as shown belows. You can use them in training your developped models in the sequel.
print(data_manager.X_train.shape, data_manager.y_train.shape)
print(data_manager.X_valid.shape, data_manager.y_valid.shape)
print(data_manager.X_test.shape, data_manager.y_test.shape)
print(data_manager.classes)
(7560, 32, 32, 3) (7560,) (946, 32, 32, 3) (946,) (946, 32, 32, 3) (946,) ['birds' 'bottles' 'breads' 'butterfiles' 'cakes' 'cats' 'chickens' 'cows' 'dogs' 'ducks' 'elephants' 'fishes' 'handguns' 'horses' 'lions' 'lipsticks' 'seals' 'snakes' 'spiders' 'vases']
We now run the default model built in the models.py file which serves as a basic baseline to start the investigation. Follow the following steps to realize how to run a model and know the built-in methods associated to a model developped in the DefaultModel class.
We first initialize a default model from the DefaultModel class. Basically, we can define the relevant parameters of training a model including num_classes, optimizer, learning_rate, batch_size, and num_epochs.
network1 = DefaultModel(name='network1',
num_classes=len(data_manager.classes),
optimizer='sgd',
batch_size= 128,
num_epochs = 20,
learning_rate=0.1)
The method build_cnn() assists us in building your convolutional neural network. You can view the code (in the models.py file) of the model behind a default model to realize how simple it is. Additionally, the method summary() shows the architecture of a model.
network1.build_cnn()
network1.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 32, 32, 32) 896
conv2d_1 (Conv2D) (None, 32, 32, 32) 9248
average_pooling2d (AverageP (None, 16, 16, 32) 0
ooling2D)
conv2d_2 (Conv2D) (None, 16, 16, 64) 18496
conv2d_3 (Conv2D) (None, 16, 16, 64) 36928
average_pooling2d_1 (Averag (None, 8, 8, 64) 0
ePooling2D)
flatten (Flatten) (None, 4096) 0
dense_3 (Dense) (None, 20) 81940
=================================================================
Total params: 147,508
Trainable params: 147,508
Non-trainable params: 0
_________________________________________________________________
None
To train a model regarding to the datasets stored in data_manager, you can invoke the method fit() for which you can specify the batch size and number of epochs for your training.
network1.fit(data_manager, batch_size = 64, num_epochs = 20)
Epoch 1/20 119/119 [==============================] - 10s 42ms/step - loss: 2.8829 - accuracy: 0.1139 - val_loss: 2.9179 - val_accuracy: 0.1226 Epoch 2/20 119/119 [==============================] - 5s 40ms/step - loss: 2.5422 - accuracy: 0.2311 - val_loss: 2.8218 - val_accuracy: 0.1797 Epoch 3/20 119/119 [==============================] - 5s 39ms/step - loss: 2.3153 - accuracy: 0.2954 - val_loss: 3.2079 - val_accuracy: 0.1723 Epoch 4/20 119/119 [==============================] - 5s 39ms/step - loss: 2.2063 - accuracy: 0.3272 - val_loss: 3.3495 - val_accuracy: 0.1734 Epoch 5/20 119/119 [==============================] - 5s 40ms/step - loss: 2.0907 - accuracy: 0.3567 - val_loss: 2.5826 - val_accuracy: 0.2093 Epoch 6/20 119/119 [==============================] - 5s 40ms/step - loss: 1.9664 - accuracy: 0.3976 - val_loss: 2.9541 - val_accuracy: 0.2178 Epoch 7/20 119/119 [==============================] - 5s 40ms/step - loss: 1.8645 - accuracy: 0.4266 - val_loss: 2.5988 - val_accuracy: 0.2347 Epoch 8/20 119/119 [==============================] - 5s 40ms/step - loss: 1.7803 - accuracy: 0.4540 - val_loss: 2.3764 - val_accuracy: 0.2801 Epoch 9/20 119/119 [==============================] - 5s 40ms/step - loss: 1.6480 - accuracy: 0.4926 - val_loss: 6.8173 - val_accuracy: 0.1564 Epoch 10/20 119/119 [==============================] - 5s 40ms/step - loss: 1.7674 - accuracy: 0.4706 - val_loss: 2.4231 - val_accuracy: 0.3277 Epoch 11/20 119/119 [==============================] - 5s 41ms/step - loss: 1.4942 - accuracy: 0.5414 - val_loss: 2.8209 - val_accuracy: 0.2928 Epoch 12/20 119/119 [==============================] - 5s 41ms/step - loss: 1.3854 - accuracy: 0.5694 - val_loss: 2.6143 - val_accuracy: 0.3510 Epoch 13/20 119/119 [==============================] - 5s 40ms/step - loss: 1.2694 - accuracy: 0.6041 - val_loss: 2.4816 - val_accuracy: 0.3552 Epoch 14/20 119/119 [==============================] - 5s 41ms/step - loss: 1.1388 - accuracy: 0.6496 - val_loss: 2.6341 - val_accuracy: 0.3510 Epoch 15/20 119/119 [==============================] - 5s 41ms/step - loss: 1.0294 - accuracy: 0.6776 - val_loss: 2.6812 - val_accuracy: 0.3721 Epoch 16/20 119/119 [==============================] - 5s 41ms/step - loss: 0.9249 - accuracy: 0.7063 - val_loss: 2.9345 - val_accuracy: 0.3689 Epoch 17/20 119/119 [==============================] - 5s 42ms/step - loss: 0.8156 - accuracy: 0.7398 - val_loss: 3.0814 - val_accuracy: 0.3615 Epoch 18/20 119/119 [==============================] - 5s 43ms/step - loss: 0.7208 - accuracy: 0.7713 - val_loss: 3.7819 - val_accuracy: 0.3319 Epoch 19/20 119/119 [==============================] - 5s 42ms/step - loss: 0.6296 - accuracy: 0.8017 - val_loss: 4.1202 - val_accuracy: 0.3214 Epoch 20/20 119/119 [==============================] - 5s 42ms/step - loss: 0.5418 - accuracy: 0.8250 - val_loss: 4.7613 - val_accuracy: 0.3129
Here you can compute the accuracy of your trained model with respect to a separate testing set.
network1.compute_accuracy(data_manager.X_test, data_manager.y_test)
15/15 [==============================] - 0s 13ms/step - loss: 4.8948 - accuracy: 0.3055
0.3054968287526427
Below shows how you can inspect the training progress.
network1.plot_progress()
You can use the method predict() to predict labels for data examples in a test set.
network1.predict(data_manager.X_test[0:10])
1/1 [==============================] - 0s 103ms/step
array([ 2, 11, 9, 18, 15, 9, 5, 0, 9, 10], dtype=int64)
Finally, the method plot_prediction() visualizes the predictions for a test set in which several images are chosen to show the predictions.
network1.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 7ms/step
<Figure size 640x480 with 0 Axes>
For questions 3.1 to 3.7, you'll need to write your own model in a way that makes it easy for you to experiment with different architectures and parameters. The goal is to be able to pass the parameters to initialize a new instance of YourModel to build different network architectures with different parameters. Below are descriptions of some parameters for YourModel:
Block architecture: Each block has the pattern [conv, batch norm, activation, conv, batch norm, activation, mean pool]. All convolutional layers have filter size $(3, 3)$, strides $(1, 1)$ and 'SAME' padding, and all mean pool layers have strides $(2, 2)$ and 'SAME' padding. The network will consists of a few blocks before applying a global average pooling (GAP) layer to obtain vectors and then a dense layer to output the logits for the softmax layer.When designing a block, there must have some instance variables as follows
num_channels: the number of channels used in a block, which will be applied to two Convs in the block.
mean_pool (True, False): the mean pool is used not. If mean_pool = True, it is used to downsample the input by two.
batch_norm (True, False): the batch normalization function is used or not. Setting batch_norm to False means not using batch normalization.
The skip connection (True, False) is added to the output of the second batch norm. Additionally, your class has a boolean property (i.e., instance variable) named use_skip. If use_skip=True, the skip connectnion is enable. Otherwise, if use_skip=False, the skip connectnion is disable.
Below is the architecture of one block:

Below is the architecture of the entire deep net with two blocks:

The above network has two blocks with the numbers of channels are 16 and 32 respectively. We apply a global average pooling (GAP) layer to flattern the output of the last block, followed by an output layer for prediction.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.layers import GlobalAveragePooling2D
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping
tf.random.set_seed(1234)
**Question 3.1** Write the code of the YourModel class here. Note that this class will inherit from the DefaultModel class. You'll only need to re-write the code for the build_cnn method in the YourModel class from the cell below. Note that the YourModel class is inherited from the DefaultModel class.
Baseline Accuracy:
# model class adapted from tutorials
class YourModel(DefaultModel):
def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
name='network1',
width=32, height=32, depth=3,
num_classes=20,
is_augmentation = False,
activation_func='relu',
optimizer='adam',
batch_size=128,
num_epochs= 30):
super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation,
activation_func, optimizer, batch_size, num_epochs,
learning_rate, verbose)
self.num_channels = num_channels
self.mean_pool = mean_pool
self.batch_norm = batch_norm
self.use_skip = use_skip
self.blocks = blocks
def build_cnn(self,x):
# Resblock code for builing each block
# x1 takes the inputs of x and passes it through a conv layer
x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
# if batch_norm is true, pass x1 through a batch norm layer
if self.batch_norm:
x1 = layers.BatchNormalization() (x1)
# apply activation function to x1
x1 = layers.Activation('relu') (x1)
# pass values of x1 into a conv layer
x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
#if value of x1 is
if self.batch_norm:
x2 = layers.BatchNormalization()(x2)
if x.shape != x2.shape:
if x2.shape[3] > x.shape[3]:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
else:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
x2_skip = layers.add([x, x2])
x2_skip = layers.Activation('relu')(x2_skip)
if self.mean_pool:
output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
else:
output_layer = x2_skip
return output_layer
def build_resnet(self):
self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
x = self.input_layer
for i in range (self.blocks):
x = self.build_cnn(x)
self.num_channels = self.num_channels*2
output_layer = GlobalAveragePooling2D()(x)
output_layer = layers.Flatten()(output_layer)
output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
Now run your model with a specific configuration.
#Your run here
# num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose
testModel = YourModel(16,2,True,True,True,0.001,True)
testModel.build_resnet()
testModel.summary()
testModel.fit(data_manager, batch_size = 16, num_epochs = 20)
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 32, 32, 3)] 0 []
conv2d_4 (Conv2D) (None, 32, 32, 32) 896 ['input_1[0][0]']
batch_normalization (BatchNorm (None, 32, 32, 32) 128 ['conv2d_4[0][0]']
alization)
activation (Activation) (None, 32, 32, 32) 0 ['batch_normalization[0][0]']
conv2d_5 (Conv2D) (None, 32, 32, 32) 9248 ['activation[0][0]']
tf.compat.v1.pad (TFOpLambda) (None, 32, 32, 32) 0 ['input_1[0][0]']
batch_normalization_1 (BatchNo (None, 32, 32, 32) 128 ['conv2d_5[0][0]']
rmalization)
add (Add) (None, 32, 32, 32) 0 ['tf.compat.v1.pad[0][0]',
'batch_normalization_1[0][0]']
activation_1 (Activation) (None, 32, 32, 32) 0 ['add[0][0]']
average_pooling2d_2 (AveragePo (None, 16, 16, 32) 0 ['activation_1[0][0]']
oling2D)
conv2d_6 (Conv2D) (None, 16, 16, 64) 18496 ['average_pooling2d_2[0][0]']
batch_normalization_2 (BatchNo (None, 16, 16, 64) 256 ['conv2d_6[0][0]']
rmalization)
activation_2 (Activation) (None, 16, 16, 64) 0 ['batch_normalization_2[0][0]']
conv2d_7 (Conv2D) (None, 16, 16, 64) 36928 ['activation_2[0][0]']
tf.compat.v1.pad_1 (TFOpLambda (None, 16, 16, 64) 0 ['average_pooling2d_2[0][0]']
)
batch_normalization_3 (BatchNo (None, 16, 16, 64) 256 ['conv2d_7[0][0]']
rmalization)
add_1 (Add) (None, 16, 16, 64) 0 ['tf.compat.v1.pad_1[0][0]',
'batch_normalization_3[0][0]']
activation_3 (Activation) (None, 16, 16, 64) 0 ['add_1[0][0]']
average_pooling2d_3 (AveragePo (None, 8, 8, 64) 0 ['activation_3[0][0]']
oling2D)
global_average_pooling2d (Glob (None, 64) 0 ['average_pooling2d_3[0][0]']
alAveragePooling2D)
flatten_1 (Flatten) (None, 64) 0 ['global_average_pooling2d[0][0]'
]
dense_4 (Dense) (None, 20) 1300 ['flatten_1[0][0]']
==================================================================================================
Total params: 67,636
Trainable params: 67,252
Non-trainable params: 384
__________________________________________________________________________________________________
None
Epoch 1/20
473/473 [==============================] - 9s 16ms/step - loss: 2.5454 - accuracy: 0.2274 - val_loss: 2.4879 - val_accuracy: 0.2357
Epoch 2/20
473/473 [==============================] - 7s 15ms/step - loss: 2.2405 - accuracy: 0.3165 - val_loss: 2.2155 - val_accuracy: 0.3087
Epoch 3/20
473/473 [==============================] - 7s 15ms/step - loss: 2.0808 - accuracy: 0.3583 - val_loss: 2.4101 - val_accuracy: 0.2558
Epoch 4/20
473/473 [==============================] - 7s 15ms/step - loss: 1.9518 - accuracy: 0.3942 - val_loss: 1.9236 - val_accuracy: 0.3932
Epoch 5/20
473/473 [==============================] - 7s 15ms/step - loss: 1.8321 - accuracy: 0.4283 - val_loss: 1.7975 - val_accuracy: 0.4197
Epoch 6/20
473/473 [==============================] - 7s 15ms/step - loss: 1.7244 - accuracy: 0.4655 - val_loss: 1.7212 - val_accuracy: 0.4641
Epoch 7/20
473/473 [==============================] - 7s 16ms/step - loss: 1.6426 - accuracy: 0.4907 - val_loss: 1.8694 - val_accuracy: 0.4123
Epoch 8/20
473/473 [==============================] - 8s 16ms/step - loss: 1.5542 - accuracy: 0.5144 - val_loss: 1.5336 - val_accuracy: 0.5402
Epoch 9/20
473/473 [==============================] - 8s 16ms/step - loss: 1.4966 - accuracy: 0.5282 - val_loss: 1.7946 - val_accuracy: 0.4450
Epoch 10/20
473/473 [==============================] - 7s 15ms/step - loss: 1.4419 - accuracy: 0.5516 - val_loss: 2.1555 - val_accuracy: 0.3552
Epoch 11/20
473/473 [==============================] - 7s 15ms/step - loss: 1.3794 - accuracy: 0.5698 - val_loss: 1.5656 - val_accuracy: 0.5116
Epoch 12/20
473/473 [==============================] - 7s 15ms/step - loss: 1.3423 - accuracy: 0.5742 - val_loss: 1.4894 - val_accuracy: 0.5349
Epoch 13/20
473/473 [==============================] - 7s 15ms/step - loss: 1.2964 - accuracy: 0.5955 - val_loss: 1.5869 - val_accuracy: 0.4979
Epoch 14/20
473/473 [==============================] - 7s 16ms/step - loss: 1.2537 - accuracy: 0.6050 - val_loss: 1.4339 - val_accuracy: 0.5507
Epoch 15/20
473/473 [==============================] - 7s 16ms/step - loss: 1.2226 - accuracy: 0.6147 - val_loss: 1.4221 - val_accuracy: 0.5412
Epoch 16/20
473/473 [==============================] - 7s 16ms/step - loss: 1.1923 - accuracy: 0.6282 - val_loss: 1.4886 - val_accuracy: 0.5370
Epoch 17/20
473/473 [==============================] - 7s 16ms/step - loss: 1.1622 - accuracy: 0.6349 - val_loss: 1.3364 - val_accuracy: 0.5920
Epoch 18/20
473/473 [==============================] - 8s 16ms/step - loss: 1.1086 - accuracy: 0.6495 - val_loss: 1.3345 - val_accuracy: 0.5867
Epoch 19/20
473/473 [==============================] - 8s 16ms/step - loss: 1.0759 - accuracy: 0.6632 - val_loss: 1.3187 - val_accuracy: 0.5888
Epoch 20/20
473/473 [==============================] - 8s 16ms/step - loss: 1.0528 - accuracy: 0.6675 - val_loss: 1.4775 - val_accuracy: 0.5772
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
15/15 [==============================] - 0s 17ms/step - loss: 1.5110 - accuracy: 0.5560
30/30 [==============================] - 0s 7ms/step
<Figure size 640x480 with 0 Axes>
#Save baseline model
testModel.model.save('models/base_dnn.h5')
**Question 3.2** Now, let us tune the number of blocks $num\_blocks \in \{3,4\}$, $use\_skip \in \{True, False\}$, $mean\_pool \in \{True, False\}$, and $learning\_rate \in \{0.001, 0.0001\}$. Write your code for this tuning and report the result of the best model on the testing set. Note that you need to show your code for tuning and evaluating on the test set to earn the full marks. During tuning, you can set the instance variable verbose of your model to False for not showing the training details of each epoch.
Best Config:
| Blocks | Skip | Pool | Rate | Accuracy |
|---|---|---|---|---|
| 4 | True | True | 0.001 | 58.77% |
After testing all configurations, we can conclude that this is the best set of instance variables and will now be used for all subsequent questions
#Insert your code here. You can add more cells if necessary
num_blocks = [3,4]
use_skip = [True,False]
mean_pool = [True,False]
learning_rate = [0.001,0.0001]
best_accuracy = -float('inf')
bestModel = None
# num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose
for blocks in num_blocks:
for skip in use_skip:
for pool in mean_pool:
for rate in learning_rate:
testModel = YourModel(16,blocks,pool,True,skip,rate,True)
testModel.build_resnet()
testModel.fit(data_manager, batch_size = 64, num_epochs = 30)
acc = testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
if acc > best_accuracy:
best_model = testModel
best_accuracy = acc
best_config = (blocks,skip,pool,rate,acc)
best_model.model.save('models/best_base_dnn_config.h5')
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32) (None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64) Epoch 1/30 119/119 [==============================] - 13s 105ms/step - loss: 2.4923 - accuracy: 0.2440 - val_loss: 3.0471 - val_accuracy: 0.1025 Epoch 2/30 119/119 [==============================] - 12s 101ms/step - loss: 2.1277 - accuracy: 0.3467 - val_loss: 3.2528 - val_accuracy: 0.1279 Epoch 3/30 119/119 [==============================] - 12s 98ms/step - loss: 1.9461 - accuracy: 0.4070 - val_loss: 2.8097 - val_accuracy: 0.1977 Epoch 4/30 119/119 [==============================] - 12s 99ms/step - loss: 1.8313 - accuracy: 0.4411 - val_loss: 2.1317 - val_accuracy: 0.3414 Epoch 5/30 119/119 [==============================] - 12s 100ms/step - loss: 1.7085 - accuracy: 0.4763 - val_loss: 1.9802 - val_accuracy: 0.4038 Epoch 6/30 119/119 [==============================] - 12s 101ms/step - loss: 1.6147 - accuracy: 0.5029 - val_loss: 1.8196 - val_accuracy: 0.4366 Epoch 7/30 119/119 [==============================] - 12s 99ms/step - loss: 1.5251 - accuracy: 0.5347 - val_loss: 1.9529 - val_accuracy: 0.4186 Epoch 8/30 119/119 [==============================] - 12s 102ms/step - loss: 1.4721 - accuracy: 0.5446 - val_loss: 1.8760 - val_accuracy: 0.4207 Epoch 9/30 119/119 [==============================] - 12s 101ms/step - loss: 1.3922 - accuracy: 0.5731 - val_loss: 1.7467 - val_accuracy: 0.4609 Epoch 10/30 119/119 [==============================] - 12s 99ms/step - loss: 1.3194 - accuracy: 0.5939 - val_loss: 1.6478 - val_accuracy: 0.4799 Epoch 11/30 119/119 [==============================] - 13s 108ms/step - loss: 1.2437 - accuracy: 0.6148 - val_loss: 1.7037 - val_accuracy: 0.4820 Epoch 12/30 119/119 [==============================] - 12s 103ms/step - loss: 1.2083 - accuracy: 0.6239 - val_loss: 1.7971 - val_accuracy: 0.4609 Epoch 13/30 119/119 [==============================] - 13s 111ms/step - loss: 1.1413 - accuracy: 0.6431 - val_loss: 1.6978 - val_accuracy: 0.4736 Epoch 14/30 119/119 [==============================] - 12s 102ms/step - loss: 1.1088 - accuracy: 0.6541 - val_loss: 1.7140 - val_accuracy: 0.4704 Epoch 15/30 119/119 [==============================] - 12s 100ms/step - loss: 1.0597 - accuracy: 0.6724 - val_loss: 1.7007 - val_accuracy: 0.5053 Epoch 16/30 119/119 [==============================] - 12s 100ms/step - loss: 1.0065 - accuracy: 0.6817 - val_loss: 2.1691 - val_accuracy: 0.3827 Epoch 17/30 119/119 [==============================] - 12s 101ms/step - loss: 0.9765 - accuracy: 0.6948 - val_loss: 1.6123 - val_accuracy: 0.5275 Epoch 18/30 119/119 [==============================] - 12s 102ms/step - loss: 0.9267 - accuracy: 0.7132 - val_loss: 1.5606 - val_accuracy: 0.5254 Epoch 19/30 119/119 [==============================] - 12s 102ms/step - loss: 0.8760 - accuracy: 0.7284 - val_loss: 1.6391 - val_accuracy: 0.4884 Epoch 20/30 119/119 [==============================] - 12s 101ms/step - loss: 0.8644 - accuracy: 0.7283 - val_loss: 1.6272 - val_accuracy: 0.5328 Epoch 21/30 119/119 [==============================] - 12s 102ms/step - loss: 0.8107 - accuracy: 0.7475 - val_loss: 1.7318 - val_accuracy: 0.5106 Epoch 22/30 119/119 [==============================] - 12s 101ms/step - loss: 0.7856 - accuracy: 0.7574 - val_loss: 1.6059 - val_accuracy: 0.4968 Epoch 23/30 119/119 [==============================] - 12s 102ms/step - loss: 0.7273 - accuracy: 0.7800 - val_loss: 1.6060 - val_accuracy: 0.5391 Epoch 24/30 119/119 [==============================] - 12s 101ms/step - loss: 0.6959 - accuracy: 0.7876 - val_loss: 1.9895 - val_accuracy: 0.4683 Epoch 25/30 119/119 [==============================] - 12s 102ms/step - loss: 0.6657 - accuracy: 0.7954 - val_loss: 2.0533 - val_accuracy: 0.4545 Epoch 26/30 119/119 [==============================] - 12s 104ms/step - loss: 0.6310 - accuracy: 0.8049 - val_loss: 1.6096 - val_accuracy: 0.5307 Epoch 27/30 119/119 [==============================] - 12s 102ms/step - loss: 0.6205 - accuracy: 0.8063 - val_loss: 1.7208 - val_accuracy: 0.5021 Epoch 28/30 119/119 [==============================] - 13s 106ms/step - loss: 0.5879 - accuracy: 0.8222 - val_loss: 1.6549 - val_accuracy: 0.5359 Epoch 29/30 119/119 [==============================] - 15s 127ms/step - loss: 0.5777 - accuracy: 0.8200 - val_loss: 1.7144 - val_accuracy: 0.5539 Epoch 30/30 119/119 [==============================] - 12s 102ms/step - loss: 0.5184 - accuracy: 0.8430 - val_loss: 1.7804 - val_accuracy: 0.5518 15/15 [==============================] - 1s 35ms/step - loss: 1.6913 - accuracy: 0.5465
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64) Epoch 1/30 119/119 [==============================] - 15s 124ms/step - loss: 2.9280 - accuracy: 0.1108 - val_loss: 2.9893 - val_accuracy: 0.0772 Epoch 2/30 119/119 [==============================] - 14s 114ms/step - loss: 2.6646 - accuracy: 0.2005 - val_loss: 2.9263 - val_accuracy: 0.0899 Epoch 3/30 119/119 [==============================] - 12s 103ms/step - loss: 2.5306 - accuracy: 0.2574 - val_loss: 2.6735 - val_accuracy: 0.1660 Epoch 4/30 119/119 [==============================] - 12s 101ms/step - loss: 2.4293 - accuracy: 0.2880 - val_loss: 2.4811 - val_accuracy: 0.2273 Epoch 5/30 119/119 [==============================] - 12s 100ms/step - loss: 2.3456 - accuracy: 0.3094 - val_loss: 2.3525 - val_accuracy: 0.2738 Epoch 6/30 119/119 [==============================] - 12s 100ms/step - loss: 2.2772 - accuracy: 0.3299 - val_loss: 2.2804 - val_accuracy: 0.2907 Epoch 7/30 119/119 [==============================] - 12s 100ms/step - loss: 2.2209 - accuracy: 0.3380 - val_loss: 2.2241 - val_accuracy: 0.3044 Epoch 8/30 119/119 [==============================] - 12s 102ms/step - loss: 2.1652 - accuracy: 0.3552 - val_loss: 2.2030 - val_accuracy: 0.3034 Epoch 9/30 119/119 [==============================] - 13s 108ms/step - loss: 2.1184 - accuracy: 0.3657 - val_loss: 2.1242 - val_accuracy: 0.3520 Epoch 10/30 119/119 [==============================] - 12s 103ms/step - loss: 2.0811 - accuracy: 0.3759 - val_loss: 2.1016 - val_accuracy: 0.3689 Epoch 11/30 119/119 [==============================] - 12s 102ms/step - loss: 2.0393 - accuracy: 0.3956 - val_loss: 2.0799 - val_accuracy: 0.3679 Epoch 12/30 119/119 [==============================] - 12s 103ms/step - loss: 2.0088 - accuracy: 0.3985 - val_loss: 2.0477 - val_accuracy: 0.3668 Epoch 13/30 119/119 [==============================] - 12s 102ms/step - loss: 1.9777 - accuracy: 0.4102 - val_loss: 2.0073 - val_accuracy: 0.3932 Epoch 14/30 119/119 [==============================] - 12s 104ms/step - loss: 1.9482 - accuracy: 0.4188 - val_loss: 2.0031 - val_accuracy: 0.3901 Epoch 15/30 119/119 [==============================] - 12s 102ms/step - loss: 1.9236 - accuracy: 0.4222 - val_loss: 1.9815 - val_accuracy: 0.3975 Epoch 16/30 119/119 [==============================] - 12s 102ms/step - loss: 1.8935 - accuracy: 0.4388 - val_loss: 1.9787 - val_accuracy: 0.4006 Epoch 17/30 119/119 [==============================] - 12s 101ms/step - loss: 1.8654 - accuracy: 0.4388 - val_loss: 1.9359 - val_accuracy: 0.3901 Epoch 18/30 119/119 [==============================] - 13s 108ms/step - loss: 1.8502 - accuracy: 0.4496 - val_loss: 1.9298 - val_accuracy: 0.4027 Epoch 19/30 119/119 [==============================] - 12s 105ms/step - loss: 1.8193 - accuracy: 0.4563 - val_loss: 1.9242 - val_accuracy: 0.4133 Epoch 20/30 119/119 [==============================] - 12s 102ms/step - loss: 1.7988 - accuracy: 0.4578 - val_loss: 1.9184 - val_accuracy: 0.4123 Epoch 21/30 119/119 [==============================] - 12s 100ms/step - loss: 1.7751 - accuracy: 0.4689 - val_loss: 1.8955 - val_accuracy: 0.4207 Epoch 22/30 119/119 [==============================] - 12s 100ms/step - loss: 1.7590 - accuracy: 0.4742 - val_loss: 1.9157 - val_accuracy: 0.4112 Epoch 23/30 119/119 [==============================] - 12s 102ms/step - loss: 1.7343 - accuracy: 0.4803 - val_loss: 1.8682 - val_accuracy: 0.4260 Epoch 24/30 119/119 [==============================] - 12s 103ms/step - loss: 1.7198 - accuracy: 0.4889 - val_loss: 1.8609 - val_accuracy: 0.4323 Epoch 25/30 119/119 [==============================] - 12s 101ms/step - loss: 1.6944 - accuracy: 0.4956 - val_loss: 1.8286 - val_accuracy: 0.4313 Epoch 26/30 119/119 [==============================] - 12s 100ms/step - loss: 1.6771 - accuracy: 0.4999 - val_loss: 1.8198 - val_accuracy: 0.4419 Epoch 27/30 119/119 [==============================] - 12s 102ms/step - loss: 1.6619 - accuracy: 0.4992 - val_loss: 1.8172 - val_accuracy: 0.4429 Epoch 28/30 119/119 [==============================] - 12s 101ms/step - loss: 1.6442 - accuracy: 0.5086 - val_loss: 1.8119 - val_accuracy: 0.4387 Epoch 29/30 119/119 [==============================] - 12s 102ms/step - loss: 1.6212 - accuracy: 0.5103 - val_loss: 1.7779 - val_accuracy: 0.4493 Epoch 30/30 119/119 [==============================] - 12s 99ms/step - loss: 1.6093 - accuracy: 0.5168 - val_loss: 1.8065 - val_accuracy: 0.4387 15/15 [==============================] - 1s 33ms/step - loss: 1.7299 - accuracy: 0.4704
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64) Epoch 1/30 119/119 [==============================] - 45s 377ms/step - loss: 2.5993 - accuracy: 0.2206 - val_loss: 3.0598 - val_accuracy: 0.0835 Epoch 2/30 119/119 [==============================] - 45s 374ms/step - loss: 2.3132 - accuracy: 0.2948 - val_loss: 2.9004 - val_accuracy: 0.1332 Epoch 3/30 119/119 [==============================] - 31s 256ms/step - loss: 2.1634 - accuracy: 0.3406 - val_loss: 2.5185 - val_accuracy: 0.2082 Epoch 4/30 119/119 [==============================] - 29s 244ms/step - loss: 2.0424 - accuracy: 0.3853 - val_loss: 2.3367 - val_accuracy: 0.2865 Epoch 5/30 119/119 [==============================] - 29s 247ms/step - loss: 1.9609 - accuracy: 0.4034 - val_loss: 2.2986 - val_accuracy: 0.3214 Epoch 6/30 119/119 [==============================] - 31s 260ms/step - loss: 1.8778 - accuracy: 0.4247 - val_loss: 2.6598 - val_accuracy: 0.2569 Epoch 7/30 119/119 [==============================] - 32s 265ms/step - loss: 1.8133 - accuracy: 0.4439 - val_loss: 2.0327 - val_accuracy: 0.3658 Epoch 8/30 119/119 [==============================] - 33s 276ms/step - loss: 1.7591 - accuracy: 0.4550 - val_loss: 2.1733 - val_accuracy: 0.3478 Epoch 9/30 119/119 [==============================] - 30s 252ms/step - loss: 1.6747 - accuracy: 0.4794 - val_loss: 2.1670 - val_accuracy: 0.3573 Epoch 10/30 119/119 [==============================] - 30s 253ms/step - loss: 1.6190 - accuracy: 0.4995 - val_loss: 1.9513 - val_accuracy: 0.3964 Epoch 11/30 119/119 [==============================] - 30s 254ms/step - loss: 1.5573 - accuracy: 0.5228 - val_loss: 1.8114 - val_accuracy: 0.4461 Epoch 12/30 119/119 [==============================] - 30s 255ms/step - loss: 1.4940 - accuracy: 0.5333 - val_loss: 1.8698 - val_accuracy: 0.4313 Epoch 13/30 119/119 [==============================] - 30s 255ms/step - loss: 1.4651 - accuracy: 0.5507 - val_loss: 2.1673 - val_accuracy: 0.3552 Epoch 14/30 119/119 [==============================] - 31s 257ms/step - loss: 1.4159 - accuracy: 0.5569 - val_loss: 1.9580 - val_accuracy: 0.4260 Epoch 15/30 119/119 [==============================] - 31s 261ms/step - loss: 1.3878 - accuracy: 0.5684 - val_loss: 1.7248 - val_accuracy: 0.4630 Epoch 16/30 119/119 [==============================] - 31s 257ms/step - loss: 1.3594 - accuracy: 0.5738 - val_loss: 2.0013 - val_accuracy: 0.3858 Epoch 17/30 119/119 [==============================] - 31s 260ms/step - loss: 1.3107 - accuracy: 0.5922 - val_loss: 1.7925 - val_accuracy: 0.4567 Epoch 18/30 119/119 [==============================] - 31s 260ms/step - loss: 1.2900 - accuracy: 0.5950 - val_loss: 1.7052 - val_accuracy: 0.4778 Epoch 19/30 119/119 [==============================] - 31s 257ms/step - loss: 1.2597 - accuracy: 0.6093 - val_loss: 1.7073 - val_accuracy: 0.4609 Epoch 20/30 119/119 [==============================] - 31s 258ms/step - loss: 1.2434 - accuracy: 0.6165 - val_loss: 1.8809 - val_accuracy: 0.4588 Epoch 21/30 119/119 [==============================] - 31s 257ms/step - loss: 1.1844 - accuracy: 0.6290 - val_loss: 1.9237 - val_accuracy: 0.4662 Epoch 22/30 119/119 [==============================] - 31s 259ms/step - loss: 1.1665 - accuracy: 0.6351 - val_loss: 1.5667 - val_accuracy: 0.5021 Epoch 23/30 119/119 [==============================] - 31s 258ms/step - loss: 1.1333 - accuracy: 0.6464 - val_loss: 1.7319 - val_accuracy: 0.4884 Epoch 24/30 119/119 [==============================] - 31s 259ms/step - loss: 1.1148 - accuracy: 0.6521 - val_loss: 1.7901 - val_accuracy: 0.4725 Epoch 25/30 119/119 [==============================] - 32s 271ms/step - loss: 1.0952 - accuracy: 0.6693 - val_loss: 1.9127 - val_accuracy: 0.4355 Epoch 26/30 119/119 [==============================] - 33s 276ms/step - loss: 1.0709 - accuracy: 0.6661 - val_loss: 1.9725 - val_accuracy: 0.4313 Epoch 27/30 119/119 [==============================] - 31s 261ms/step - loss: 1.0568 - accuracy: 0.6758 - val_loss: 1.6910 - val_accuracy: 0.5137 Epoch 28/30 119/119 [==============================] - 31s 259ms/step - loss: 1.0248 - accuracy: 0.6841 - val_loss: 1.8256 - val_accuracy: 0.4789 Epoch 29/30 119/119 [==============================] - 31s 259ms/step - loss: 1.0102 - accuracy: 0.6840 - val_loss: 1.9483 - val_accuracy: 0.4535 Epoch 30/30 119/119 [==============================] - 31s 260ms/step - loss: 0.9965 - accuracy: 0.6919 - val_loss: 1.6930 - val_accuracy: 0.5095 15/15 [==============================] - 1s 76ms/step - loss: 1.5292 - accuracy: 0.5645
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64) Epoch 1/30 119/119 [==============================] - 31s 258ms/step - loss: 3.0070 - accuracy: 0.1263 - val_loss: 2.9836 - val_accuracy: 0.0655 Epoch 2/30 119/119 [==============================] - 31s 260ms/step - loss: 2.7050 - accuracy: 0.2094 - val_loss: 2.8827 - val_accuracy: 0.1057 Epoch 3/30 119/119 [==============================] - 31s 258ms/step - loss: 2.6049 - accuracy: 0.2329 - val_loss: 2.6769 - val_accuracy: 0.1723 Epoch 4/30 119/119 [==============================] - 31s 260ms/step - loss: 2.5347 - accuracy: 0.2556 - val_loss: 2.5364 - val_accuracy: 0.2336 Epoch 5/30 119/119 [==============================] - 31s 257ms/step - loss: 2.4813 - accuracy: 0.2680 - val_loss: 2.4768 - val_accuracy: 0.2611 Epoch 6/30 119/119 [==============================] - 31s 259ms/step - loss: 2.4375 - accuracy: 0.2817 - val_loss: 2.4490 - val_accuracy: 0.2791 Epoch 7/30 119/119 [==============================] - 31s 258ms/step - loss: 2.4003 - accuracy: 0.2901 - val_loss: 2.3748 - val_accuracy: 0.2960 Epoch 8/30 119/119 [==============================] - 31s 264ms/step - loss: 2.3609 - accuracy: 0.3024 - val_loss: 2.3632 - val_accuracy: 0.2939 Epoch 9/30 119/119 [==============================] - 31s 259ms/step - loss: 2.3312 - accuracy: 0.3104 - val_loss: 2.3478 - val_accuracy: 0.2896 Epoch 10/30 119/119 [==============================] - 31s 259ms/step - loss: 2.3063 - accuracy: 0.3188 - val_loss: 2.3062 - val_accuracy: 0.2981 Epoch 11/30 119/119 [==============================] - 31s 259ms/step - loss: 2.2751 - accuracy: 0.3300 - val_loss: 2.2863 - val_accuracy: 0.3266 Epoch 12/30 119/119 [==============================] - 31s 259ms/step - loss: 2.2487 - accuracy: 0.3312 - val_loss: 2.2497 - val_accuracy: 0.3214 Epoch 13/30 119/119 [==============================] - 31s 260ms/step - loss: 2.2286 - accuracy: 0.3397 - val_loss: 2.2308 - val_accuracy: 0.3192 Epoch 14/30 119/119 [==============================] - 31s 262ms/step - loss: 2.2003 - accuracy: 0.3454 - val_loss: 2.1979 - val_accuracy: 0.3414 Epoch 15/30 119/119 [==============================] - 31s 259ms/step - loss: 2.1801 - accuracy: 0.3475 - val_loss: 2.2015 - val_accuracy: 0.3414 Epoch 16/30 119/119 [==============================] - 31s 259ms/step - loss: 2.1575 - accuracy: 0.3585 - val_loss: 2.1855 - val_accuracy: 0.3541 Epoch 17/30 119/119 [==============================] - 31s 260ms/step - loss: 2.1363 - accuracy: 0.3672 - val_loss: 2.1712 - val_accuracy: 0.3372 Epoch 18/30 119/119 [==============================] - 31s 259ms/step - loss: 2.1228 - accuracy: 0.3754 - val_loss: 2.1378 - val_accuracy: 0.3626 Epoch 19/30 119/119 [==============================] - 31s 260ms/step - loss: 2.0974 - accuracy: 0.3763 - val_loss: 2.1328 - val_accuracy: 0.3827 Epoch 20/30 119/119 [==============================] - 31s 263ms/step - loss: 2.0794 - accuracy: 0.3798 - val_loss: 2.1067 - val_accuracy: 0.3763 Epoch 21/30 119/119 [==============================] - 32s 265ms/step - loss: 2.0613 - accuracy: 0.3870 - val_loss: 2.0842 - val_accuracy: 0.3890 Epoch 22/30 119/119 [==============================] - 31s 260ms/step - loss: 2.0448 - accuracy: 0.3910 - val_loss: 2.0888 - val_accuracy: 0.3816 Epoch 23/30 119/119 [==============================] - 31s 264ms/step - loss: 2.0293 - accuracy: 0.3967 - val_loss: 2.0664 - val_accuracy: 0.3869 Epoch 24/30 119/119 [==============================] - 31s 259ms/step - loss: 2.0159 - accuracy: 0.3988 - val_loss: 2.0503 - val_accuracy: 0.4080 Epoch 25/30 119/119 [==============================] - 31s 264ms/step - loss: 1.9938 - accuracy: 0.4040 - val_loss: 2.0377 - val_accuracy: 0.4080 Epoch 26/30 119/119 [==============================] - 31s 260ms/step - loss: 1.9779 - accuracy: 0.4122 - val_loss: 2.0320 - val_accuracy: 0.4059 Epoch 27/30 119/119 [==============================] - 31s 259ms/step - loss: 1.9679 - accuracy: 0.4085 - val_loss: 2.0045 - val_accuracy: 0.4260 Epoch 28/30 119/119 [==============================] - 31s 258ms/step - loss: 1.9531 - accuracy: 0.4188 - val_loss: 1.9911 - val_accuracy: 0.4027 Epoch 29/30 119/119 [==============================] - 31s 260ms/step - loss: 1.9373 - accuracy: 0.4138 - val_loss: 1.9827 - val_accuracy: 0.4313 Epoch 30/30 119/119 [==============================] - 31s 258ms/step - loss: 1.9282 - accuracy: 0.4214 - val_loss: 1.9677 - val_accuracy: 0.4281 15/15 [==============================] - 1s 74ms/step - loss: 1.9307 - accuracy: 0.4207
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32) (None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64) Epoch 1/30 119/119 [==============================] - 9s 73ms/step - loss: 2.5564 - accuracy: 0.2312 - val_loss: 3.0646 - val_accuracy: 0.1004 Epoch 2/30 119/119 [==============================] - 9s 72ms/step - loss: 2.1750 - accuracy: 0.3421 - val_loss: 3.2105 - val_accuracy: 0.1628 Epoch 3/30 119/119 [==============================] - 9s 74ms/step - loss: 1.9692 - accuracy: 0.3984 - val_loss: 2.6672 - val_accuracy: 0.2040 Epoch 4/30 119/119 [==============================] - 8s 71ms/step - loss: 1.8273 - accuracy: 0.4467 - val_loss: 2.2065 - val_accuracy: 0.3362 Epoch 5/30 119/119 [==============================] - 8s 70ms/step - loss: 1.7050 - accuracy: 0.4803 - val_loss: 2.5577 - val_accuracy: 0.2907 Epoch 6/30 119/119 [==============================] - 8s 70ms/step - loss: 1.6039 - accuracy: 0.5124 - val_loss: 1.8100 - val_accuracy: 0.4355 Epoch 7/30 119/119 [==============================] - 9s 72ms/step - loss: 1.5307 - accuracy: 0.5292 - val_loss: 1.7782 - val_accuracy: 0.4577 Epoch 8/30 119/119 [==============================] - 8s 71ms/step - loss: 1.4548 - accuracy: 0.5562 - val_loss: 1.7533 - val_accuracy: 0.4471 Epoch 9/30 119/119 [==============================] - 9s 71ms/step - loss: 1.3723 - accuracy: 0.5746 - val_loss: 1.8553 - val_accuracy: 0.4567 Epoch 10/30 119/119 [==============================] - 8s 71ms/step - loss: 1.2927 - accuracy: 0.6041 - val_loss: 1.8957 - val_accuracy: 0.4165 Epoch 11/30 119/119 [==============================] - 8s 71ms/step - loss: 1.2390 - accuracy: 0.6157 - val_loss: 1.8694 - val_accuracy: 0.4419 Epoch 12/30 119/119 [==============================] - 8s 71ms/step - loss: 1.1868 - accuracy: 0.6366 - val_loss: 1.6508 - val_accuracy: 0.5011 Epoch 13/30 119/119 [==============================] - 8s 71ms/step - loss: 1.1314 - accuracy: 0.6519 - val_loss: 1.9929 - val_accuracy: 0.4281 Epoch 14/30 119/119 [==============================] - 9s 72ms/step - loss: 1.1002 - accuracy: 0.6636 - val_loss: 1.7336 - val_accuracy: 0.4894 Epoch 15/30 119/119 [==============================] - 9s 71ms/step - loss: 1.0398 - accuracy: 0.6784 - val_loss: 1.6048 - val_accuracy: 0.5148 Epoch 16/30 119/119 [==============================] - 8s 70ms/step - loss: 0.9964 - accuracy: 0.6911 - val_loss: 2.0252 - val_accuracy: 0.4070 Epoch 17/30 119/119 [==============================] - 8s 70ms/step - loss: 0.9685 - accuracy: 0.6989 - val_loss: 1.8448 - val_accuracy: 0.4746 Epoch 18/30 119/119 [==============================] - 8s 71ms/step - loss: 0.9244 - accuracy: 0.7114 - val_loss: 1.5801 - val_accuracy: 0.5127 Epoch 19/30 119/119 [==============================] - 8s 69ms/step - loss: 0.8682 - accuracy: 0.7286 - val_loss: 1.9696 - val_accuracy: 0.4514 Epoch 20/30 119/119 [==============================] - 8s 71ms/step - loss: 0.8400 - accuracy: 0.7388 - val_loss: 1.7320 - val_accuracy: 0.4958 Epoch 21/30 119/119 [==============================] - 8s 71ms/step - loss: 0.7814 - accuracy: 0.7618 - val_loss: 2.0780 - val_accuracy: 0.4757 Epoch 22/30 119/119 [==============================] - 8s 71ms/step - loss: 0.7626 - accuracy: 0.7601 - val_loss: 1.9845 - val_accuracy: 0.4672 Epoch 23/30 119/119 [==============================] - 8s 71ms/step - loss: 0.7273 - accuracy: 0.7739 - val_loss: 1.9220 - val_accuracy: 0.4989 Epoch 24/30 119/119 [==============================] - 9s 72ms/step - loss: 0.6824 - accuracy: 0.7897 - val_loss: 2.1136 - val_accuracy: 0.4429 Epoch 25/30 119/119 [==============================] - 8s 70ms/step - loss: 0.6467 - accuracy: 0.7993 - val_loss: 1.9164 - val_accuracy: 0.4545 Epoch 26/30 119/119 [==============================] - 9s 71ms/step - loss: 0.6146 - accuracy: 0.8127 - val_loss: 1.7934 - val_accuracy: 0.5063 Epoch 27/30 119/119 [==============================] - 8s 70ms/step - loss: 0.5848 - accuracy: 0.8218 - val_loss: 1.5915 - val_accuracy: 0.5307 Epoch 28/30 119/119 [==============================] - 9s 72ms/step - loss: 0.5579 - accuracy: 0.8304 - val_loss: 2.0371 - val_accuracy: 0.4757 Epoch 29/30 119/119 [==============================] - 8s 70ms/step - loss: 0.5395 - accuracy: 0.8385 - val_loss: 1.7424 - val_accuracy: 0.5127 Epoch 30/30 119/119 [==============================] - 8s 71ms/step - loss: 0.4997 - accuracy: 0.8483 - val_loss: 1.6464 - val_accuracy: 0.5550 15/15 [==============================] - 0s 23ms/step - loss: 1.5557 - accuracy: 0.5867
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64) Epoch 1/30 119/119 [==============================] - 8s 71ms/step - loss: 2.9814 - accuracy: 0.1140 - val_loss: 2.9667 - val_accuracy: 0.0835 Epoch 2/30 119/119 [==============================] - 8s 70ms/step - loss: 2.6891 - accuracy: 0.1938 - val_loss: 2.8303 - val_accuracy: 0.1427 Epoch 3/30 119/119 [==============================] - 8s 71ms/step - loss: 2.5423 - accuracy: 0.2521 - val_loss: 2.5894 - val_accuracy: 0.2156 Epoch 4/30 119/119 [==============================] - 8s 70ms/step - loss: 2.4431 - accuracy: 0.2844 - val_loss: 2.4466 - val_accuracy: 0.2632 Epoch 5/30 119/119 [==============================] - 9s 72ms/step - loss: 2.3616 - accuracy: 0.3021 - val_loss: 2.3605 - val_accuracy: 0.2981 Epoch 6/30 119/119 [==============================] - 8s 69ms/step - loss: 2.2964 - accuracy: 0.3249 - val_loss: 2.3237 - val_accuracy: 0.2949 Epoch 7/30 119/119 [==============================] - 9s 72ms/step - loss: 2.2429 - accuracy: 0.3300 - val_loss: 2.2628 - val_accuracy: 0.3161 Epoch 8/30 119/119 [==============================] - 8s 71ms/step - loss: 2.1861 - accuracy: 0.3504 - val_loss: 2.2352 - val_accuracy: 0.3288 Epoch 9/30 119/119 [==============================] - 8s 71ms/step - loss: 2.1428 - accuracy: 0.3598 - val_loss: 2.2067 - val_accuracy: 0.3277 Epoch 10/30 119/119 [==============================] - 8s 70ms/step - loss: 2.1051 - accuracy: 0.3685 - val_loss: 2.1687 - val_accuracy: 0.3383 Epoch 11/30 119/119 [==============================] - 8s 71ms/step - loss: 2.0656 - accuracy: 0.3798 - val_loss: 2.1352 - val_accuracy: 0.3541 Epoch 12/30 119/119 [==============================] - 9s 73ms/step - loss: 2.0328 - accuracy: 0.3882 - val_loss: 2.1101 - val_accuracy: 0.3510 Epoch 13/30 119/119 [==============================] - 8s 71ms/step - loss: 2.0032 - accuracy: 0.3987 - val_loss: 2.0832 - val_accuracy: 0.3700 Epoch 14/30 119/119 [==============================] - 8s 70ms/step - loss: 1.9717 - accuracy: 0.4060 - val_loss: 2.0381 - val_accuracy: 0.3763 Epoch 15/30 119/119 [==============================] - 9s 71ms/step - loss: 1.9439 - accuracy: 0.4135 - val_loss: 2.0286 - val_accuracy: 0.3922 Epoch 16/30 119/119 [==============================] - 8s 71ms/step - loss: 1.9127 - accuracy: 0.4255 - val_loss: 2.0308 - val_accuracy: 0.3890 Epoch 17/30 119/119 [==============================] - 8s 71ms/step - loss: 1.8875 - accuracy: 0.4279 - val_loss: 1.9925 - val_accuracy: 0.3901 Epoch 18/30 119/119 [==============================] - 8s 71ms/step - loss: 1.8678 - accuracy: 0.4386 - val_loss: 1.9628 - val_accuracy: 0.4070 Epoch 19/30 119/119 [==============================] - 8s 71ms/step - loss: 1.8360 - accuracy: 0.4491 - val_loss: 1.9794 - val_accuracy: 0.4027 Epoch 20/30 119/119 [==============================] - 8s 71ms/step - loss: 1.8166 - accuracy: 0.4552 - val_loss: 1.9391 - val_accuracy: 0.4144 Epoch 21/30 119/119 [==============================] - 8s 71ms/step - loss: 1.7931 - accuracy: 0.4636 - val_loss: 1.9311 - val_accuracy: 0.4017 Epoch 22/30 119/119 [==============================] - 8s 71ms/step - loss: 1.7764 - accuracy: 0.4684 - val_loss: 1.9246 - val_accuracy: 0.4154 Epoch 23/30 119/119 [==============================] - 8s 71ms/step - loss: 1.7477 - accuracy: 0.4742 - val_loss: 1.9544 - val_accuracy: 0.4165 Epoch 24/30 119/119 [==============================] - 8s 71ms/step - loss: 1.7375 - accuracy: 0.4741 - val_loss: 1.9060 - val_accuracy: 0.4154 Epoch 25/30 119/119 [==============================] - 8s 70ms/step - loss: 1.7108 - accuracy: 0.4839 - val_loss: 1.8796 - val_accuracy: 0.4271 Epoch 26/30 119/119 [==============================] - 9s 72ms/step - loss: 1.6926 - accuracy: 0.4896 - val_loss: 1.8658 - val_accuracy: 0.4376 Epoch 27/30 119/119 [==============================] - 8s 70ms/step - loss: 1.6742 - accuracy: 0.4938 - val_loss: 1.8440 - val_accuracy: 0.4249 Epoch 28/30 119/119 [==============================] - 9s 72ms/step - loss: 1.6631 - accuracy: 0.5016 - val_loss: 1.8284 - val_accuracy: 0.4355 Epoch 29/30 119/119 [==============================] - 8s 71ms/step - loss: 1.6380 - accuracy: 0.5108 - val_loss: 1.8169 - val_accuracy: 0.4450 Epoch 30/30 119/119 [==============================] - 9s 72ms/step - loss: 1.6253 - accuracy: 0.5134 - val_loss: 1.8194 - val_accuracy: 0.4514 15/15 [==============================] - 0s 22ms/step - loss: 1.7488 - accuracy: 0.4736
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
Epoch 1/30 119/119 [==============================] - 31s 257ms/step - loss: 2.6296 - accuracy: 0.2056 - val_loss: 2.8556 - val_accuracy: 0.0983 Epoch 2/30 119/119 [==============================] - 31s 259ms/step - loss: 2.3787 - accuracy: 0.2836 - val_loss: 2.6694 - val_accuracy: 0.1638 Epoch 3/30 119/119 [==============================] - 31s 258ms/step - loss: 2.2439 - accuracy: 0.3136 - val_loss: 2.3958 - val_accuracy: 0.2326 Epoch 4/30 119/119 [==============================] - 31s 259ms/step - loss: 2.1156 - accuracy: 0.3521 - val_loss: 2.1679 - val_accuracy: 0.3224 Epoch 5/30 119/119 [==============================] - 31s 257ms/step - loss: 2.0314 - accuracy: 0.3769 - val_loss: 2.1703 - val_accuracy: 0.3414 Epoch 6/30 119/119 [==============================] - 32s 265ms/step - loss: 1.9413 - accuracy: 0.4012 - val_loss: 2.3300 - val_accuracy: 0.2822 Epoch 7/30 119/119 [==============================] - 31s 260ms/step - loss: 1.8751 - accuracy: 0.4272 - val_loss: 2.0460 - val_accuracy: 0.3647 Epoch 8/30 119/119 [==============================] - 31s 261ms/step - loss: 1.8073 - accuracy: 0.4460 - val_loss: 2.1693 - val_accuracy: 0.3319 Epoch 9/30 119/119 [==============================] - 31s 259ms/step - loss: 1.7244 - accuracy: 0.4665 - val_loss: 2.0222 - val_accuracy: 0.3689 Epoch 10/30 119/119 [==============================] - 31s 261ms/step - loss: 1.6705 - accuracy: 0.4806 - val_loss: 1.9322 - val_accuracy: 0.3953 Epoch 11/30 119/119 [==============================] - 31s 260ms/step - loss: 1.6042 - accuracy: 0.5037 - val_loss: 2.4594 - val_accuracy: 0.3309 Epoch 12/30 119/119 [==============================] - 31s 261ms/step - loss: 1.5649 - accuracy: 0.5144 - val_loss: 1.9625 - val_accuracy: 0.3869 Epoch 13/30 119/119 [==============================] - 31s 261ms/step - loss: 1.5048 - accuracy: 0.5386 - val_loss: 2.0244 - val_accuracy: 0.4080 Epoch 14/30 119/119 [==============================] - 31s 262ms/step - loss: 1.4788 - accuracy: 0.5362 - val_loss: 1.8352 - val_accuracy: 0.4397 Epoch 15/30 119/119 [==============================] - 31s 261ms/step - loss: 1.4406 - accuracy: 0.5603 - val_loss: 1.8192 - val_accuracy: 0.4408 Epoch 16/30 119/119 [==============================] - 31s 261ms/step - loss: 1.4021 - accuracy: 0.5622 - val_loss: 2.0564 - val_accuracy: 0.3795 Epoch 17/30 119/119 [==============================] - 32s 265ms/step - loss: 1.3629 - accuracy: 0.5792 - val_loss: 1.6949 - val_accuracy: 0.4355 Epoch 18/30 119/119 [==============================] - 31s 263ms/step - loss: 1.3345 - accuracy: 0.5862 - val_loss: 1.6340 - val_accuracy: 0.4662 Epoch 19/30 119/119 [==============================] - 31s 262ms/step - loss: 1.2925 - accuracy: 0.6022 - val_loss: 1.7518 - val_accuracy: 0.4609 Epoch 20/30 119/119 [==============================] - 31s 261ms/step - loss: 1.2891 - accuracy: 0.6021 - val_loss: 1.5895 - val_accuracy: 0.5085 Epoch 21/30 119/119 [==============================] - 31s 262ms/step - loss: 1.2330 - accuracy: 0.6177 - val_loss: 1.6727 - val_accuracy: 0.4810 Epoch 22/30 119/119 [==============================] - 31s 260ms/step - loss: 1.2132 - accuracy: 0.6201 - val_loss: 1.7371 - val_accuracy: 0.4757 Epoch 23/30 119/119 [==============================] - 31s 261ms/step - loss: 1.1826 - accuracy: 0.6384 - val_loss: 1.6990 - val_accuracy: 0.4556 Epoch 24/30 119/119 [==============================] - 31s 260ms/step - loss: 1.1520 - accuracy: 0.6478 - val_loss: 1.7441 - val_accuracy: 0.4725 Epoch 25/30 119/119 [==============================] - 31s 265ms/step - loss: 1.1235 - accuracy: 0.6567 - val_loss: 1.7342 - val_accuracy: 0.4958 Epoch 26/30 119/119 [==============================] - 31s 262ms/step - loss: 1.1014 - accuracy: 0.6574 - val_loss: 2.0277 - val_accuracy: 0.4524 Epoch 27/30 119/119 [==============================] - 31s 262ms/step - loss: 1.0814 - accuracy: 0.6660 - val_loss: 1.6935 - val_accuracy: 0.4778 Epoch 28/30 119/119 [==============================] - 32s 267ms/step - loss: 1.0592 - accuracy: 0.6749 - val_loss: 1.8152 - val_accuracy: 0.4471 Epoch 29/30 119/119 [==============================] - 31s 261ms/step - loss: 1.0396 - accuracy: 0.6762 - val_loss: 1.6763 - val_accuracy: 0.5349 Epoch 30/30 119/119 [==============================] - 31s 261ms/step - loss: 1.0158 - accuracy: 0.6832 - val_loss: 2.1599 - val_accuracy: 0.3996 15/15 [==============================] - 1s 74ms/step - loss: 1.9348 - accuracy: 0.4419
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64) Epoch 1/30 119/119 [==============================] - 31s 262ms/step - loss: 2.8817 - accuracy: 0.1224 - val_loss: 2.9331 - val_accuracy: 0.1068 Epoch 2/30 119/119 [==============================] - 31s 261ms/step - loss: 2.6501 - accuracy: 0.2049 - val_loss: 2.7761 - val_accuracy: 0.1438 Epoch 3/30 119/119 [==============================] - 31s 264ms/step - loss: 2.5590 - accuracy: 0.2479 - val_loss: 2.5903 - val_accuracy: 0.1956 Epoch 4/30 119/119 [==============================] - 32s 267ms/step - loss: 2.4942 - accuracy: 0.2684 - val_loss: 2.4825 - val_accuracy: 0.2526 Epoch 5/30 119/119 [==============================] - 32s 264ms/step - loss: 2.4435 - accuracy: 0.2820 - val_loss: 2.4502 - val_accuracy: 0.2653 Epoch 6/30 119/119 [==============================] - 31s 264ms/step - loss: 2.4077 - accuracy: 0.2878 - val_loss: 2.4252 - val_accuracy: 0.2611 Epoch 7/30 119/119 [==============================] - 31s 262ms/step - loss: 2.3718 - accuracy: 0.3007 - val_loss: 2.3738 - val_accuracy: 0.2886 Epoch 8/30 119/119 [==============================] - 31s 263ms/step - loss: 2.3360 - accuracy: 0.3116 - val_loss: 2.3470 - val_accuracy: 0.2896 Epoch 9/30 119/119 [==============================] - 31s 261ms/step - loss: 2.3061 - accuracy: 0.3169 - val_loss: 2.3270 - val_accuracy: 0.2822 Epoch 10/30 119/119 [==============================] - 32s 269ms/step - loss: 2.2768 - accuracy: 0.3267 - val_loss: 2.2896 - val_accuracy: 0.3087 Epoch 11/30 119/119 [==============================] - 31s 261ms/step - loss: 2.2433 - accuracy: 0.3423 - val_loss: 2.2748 - val_accuracy: 0.3076 Epoch 12/30 119/119 [==============================] - 31s 264ms/step - loss: 2.2172 - accuracy: 0.3407 - val_loss: 2.2461 - val_accuracy: 0.3414 Epoch 13/30 119/119 [==============================] - 31s 265ms/step - loss: 2.1953 - accuracy: 0.3491 - val_loss: 2.2236 - val_accuracy: 0.3192 Epoch 14/30 119/119 [==============================] - 32s 266ms/step - loss: 2.1661 - accuracy: 0.3560 - val_loss: 2.1939 - val_accuracy: 0.3319 Epoch 15/30 119/119 [==============================] - 31s 261ms/step - loss: 2.1459 - accuracy: 0.3628 - val_loss: 2.1866 - val_accuracy: 0.3446 Epoch 16/30 119/119 [==============================] - 31s 262ms/step - loss: 2.1231 - accuracy: 0.3716 - val_loss: 2.1683 - val_accuracy: 0.3636 Epoch 17/30 119/119 [==============================] - 31s 262ms/step - loss: 2.1019 - accuracy: 0.3709 - val_loss: 2.1496 - val_accuracy: 0.3446 Epoch 18/30 119/119 [==============================] - 31s 262ms/step - loss: 2.0892 - accuracy: 0.3807 - val_loss: 2.1291 - val_accuracy: 0.3700 Epoch 19/30 119/119 [==============================] - 31s 262ms/step - loss: 2.0625 - accuracy: 0.3872 - val_loss: 2.1288 - val_accuracy: 0.3732 Epoch 20/30 119/119 [==============================] - 31s 262ms/step - loss: 2.0472 - accuracy: 0.3956 - val_loss: 2.0893 - val_accuracy: 0.3816 Epoch 21/30 119/119 [==============================] - 32s 266ms/step - loss: 2.0298 - accuracy: 0.3978 - val_loss: 2.0850 - val_accuracy: 0.3816 Epoch 22/30 119/119 [==============================] - 31s 264ms/step - loss: 2.0139 - accuracy: 0.3992 - val_loss: 2.0716 - val_accuracy: 0.3953 Epoch 23/30 119/119 [==============================] - 31s 264ms/step - loss: 1.9966 - accuracy: 0.4054 - val_loss: 2.0595 - val_accuracy: 0.3901 Epoch 24/30 119/119 [==============================] - 36s 305ms/step - loss: 1.9844 - accuracy: 0.4075 - val_loss: 2.0470 - val_accuracy: 0.3943 Epoch 25/30 119/119 [==============================] - 32s 268ms/step - loss: 1.9644 - accuracy: 0.4122 - val_loss: 2.0328 - val_accuracy: 0.4017 Epoch 26/30 119/119 [==============================] - 31s 264ms/step - loss: 1.9518 - accuracy: 0.4151 - val_loss: 2.0436 - val_accuracy: 0.3879 Epoch 27/30 119/119 [==============================] - 31s 262ms/step - loss: 1.9390 - accuracy: 0.4257 - val_loss: 2.0132 - val_accuracy: 0.3922 Epoch 28/30 119/119 [==============================] - 31s 261ms/step - loss: 1.9265 - accuracy: 0.4238 - val_loss: 1.9929 - val_accuracy: 0.4154 Epoch 29/30 119/119 [==============================] - 32s 266ms/step - loss: 1.9097 - accuracy: 0.4321 - val_loss: 1.9731 - val_accuracy: 0.4101 Epoch 30/30 119/119 [==============================] - 31s 264ms/step - loss: 1.8956 - accuracy: 0.4364 - val_loss: 1.9788 - val_accuracy: 0.4017 15/15 [==============================] - 1s 77ms/step - loss: 1.9128 - accuracy: 0.4165
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32) (None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64) (None, 4, 4, 64) (None, 4, 4, 128) (None, 4, 4, 128)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
Epoch 1/30 119/119 [==============================] - 11s 93ms/step - loss: 2.3834 - accuracy: 0.2689 - val_loss: 3.4486 - val_accuracy: 0.0909 Epoch 2/30 119/119 [==============================] - 20s 171ms/step - loss: 1.9299 - accuracy: 0.4024 - val_loss: 3.0082 - val_accuracy: 0.1850 Epoch 3/30 119/119 [==============================] - 10s 88ms/step - loss: 1.6950 - accuracy: 0.4734 - val_loss: 2.7162 - val_accuracy: 0.2537 Epoch 4/30 119/119 [==============================] - 10s 85ms/step - loss: 1.5239 - accuracy: 0.5292 - val_loss: 2.0900 - val_accuracy: 0.3784 Epoch 5/30 119/119 [==============================] - 11s 93ms/step - loss: 1.3720 - accuracy: 0.5700 - val_loss: 2.0462 - val_accuracy: 0.4080 Epoch 6/30 119/119 [==============================] - 10s 85ms/step - loss: 1.2388 - accuracy: 0.6123 - val_loss: 1.9015 - val_accuracy: 0.4239 Epoch 7/30 119/119 [==============================] - 10s 86ms/step - loss: 1.1414 - accuracy: 0.6459 - val_loss: 1.9362 - val_accuracy: 0.4577 Epoch 8/30 119/119 [==============================] - 10s 84ms/step - loss: 1.0127 - accuracy: 0.6856 - val_loss: 1.7292 - val_accuracy: 0.4915 Epoch 9/30 119/119 [==============================] - 10s 85ms/step - loss: 0.8881 - accuracy: 0.7237 - val_loss: 1.7947 - val_accuracy: 0.4968 Epoch 10/30 119/119 [==============================] - 10s 86ms/step - loss: 0.7735 - accuracy: 0.7612 - val_loss: 2.2651 - val_accuracy: 0.4038 Epoch 11/30 119/119 [==============================] - 10s 85ms/step - loss: 0.6581 - accuracy: 0.7989 - val_loss: 2.1604 - val_accuracy: 0.4323 Epoch 12/30 119/119 [==============================] - 10s 86ms/step - loss: 0.5880 - accuracy: 0.8221 - val_loss: 2.1410 - val_accuracy: 0.4810 Epoch 13/30 119/119 [==============================] - 10s 87ms/step - loss: 0.4658 - accuracy: 0.8626 - val_loss: 2.3875 - val_accuracy: 0.4471 Epoch 14/30 119/119 [==============================] - 10s 86ms/step - loss: 0.3569 - accuracy: 0.8942 - val_loss: 2.3137 - val_accuracy: 0.4440 Epoch 15/30 119/119 [==============================] - 10s 86ms/step - loss: 0.2936 - accuracy: 0.9201 - val_loss: 2.4142 - val_accuracy: 0.4619 Epoch 16/30 119/119 [==============================] - 10s 85ms/step - loss: 0.3102 - accuracy: 0.9108 - val_loss: 2.0854 - val_accuracy: 0.5095 Epoch 17/30 119/119 [==============================] - 10s 88ms/step - loss: 0.1978 - accuracy: 0.9497 - val_loss: 2.2813 - val_accuracy: 0.4598 Epoch 18/30 119/119 [==============================] - 11s 88ms/step - loss: 0.1296 - accuracy: 0.9716 - val_loss: 2.6530 - val_accuracy: 0.4556 Epoch 19/30 119/119 [==============================] - 10s 86ms/step - loss: 0.1847 - accuracy: 0.9467 - val_loss: 2.5972 - val_accuracy: 0.4556 Epoch 20/30 119/119 [==============================] - 10s 86ms/step - loss: 0.2408 - accuracy: 0.9269 - val_loss: 2.9407 - val_accuracy: 0.4440 Epoch 21/30 119/119 [==============================] - 10s 86ms/step - loss: 0.1143 - accuracy: 0.9720 - val_loss: 2.1467 - val_accuracy: 0.5328 Epoch 22/30 119/119 [==============================] - 10s 84ms/step - loss: 0.0754 - accuracy: 0.9848 - val_loss: 2.1907 - val_accuracy: 0.5190 Epoch 23/30 119/119 [==============================] - 10s 86ms/step - loss: 0.1872 - accuracy: 0.9459 - val_loss: 2.6617 - val_accuracy: 0.4672 Epoch 24/30 119/119 [==============================] - 10s 86ms/step - loss: 0.0693 - accuracy: 0.9839 - val_loss: 2.7640 - val_accuracy: 0.4725 Epoch 25/30 119/119 [==============================] - 10s 84ms/step - loss: 0.0858 - accuracy: 0.9794 - val_loss: 2.8388 - val_accuracy: 0.4535 Epoch 26/30 119/119 [==============================] - 10s 87ms/step - loss: 0.0361 - accuracy: 0.9935 - val_loss: 2.2049 - val_accuracy: 0.5349 Epoch 27/30 119/119 [==============================] - 10s 85ms/step - loss: 0.0601 - accuracy: 0.9856 - val_loss: 2.9170 - val_accuracy: 0.4767 Epoch 28/30 119/119 [==============================] - 10s 86ms/step - loss: 0.2449 - accuracy: 0.9209 - val_loss: 2.6399 - val_accuracy: 0.4937 Epoch 29/30 119/119 [==============================] - 10s 87ms/step - loss: 0.1345 - accuracy: 0.9624 - val_loss: 2.3574 - val_accuracy: 0.5349 Epoch 30/30 119/119 [==============================] - 10s 85ms/step - loss: 0.0469 - accuracy: 0.9923 - val_loss: 2.3174 - val_accuracy: 0.5381 15/15 [==============================] - 0s 28ms/step - loss: 2.0742 - accuracy: 0.5877
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32) (None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 4, 4, 64) (None, 4, 4, 128) (None, 4, 4, 128) Epoch 1/30 119/119 [==============================] - 10s 85ms/step - loss: 2.6846 - accuracy: 0.1992 - val_loss: 3.0725 - val_accuracy: 0.0529 Epoch 2/30 119/119 [==============================] - 10s 84ms/step - loss: 2.2291 - accuracy: 0.3372 - val_loss: 3.0370 - val_accuracy: 0.1290 Epoch 3/30 119/119 [==============================] - 10s 83ms/step - loss: 2.0459 - accuracy: 0.3844 - val_loss: 2.8209 - val_accuracy: 0.1744 Epoch 4/30 119/119 [==============================] - 10s 87ms/step - loss: 1.9152 - accuracy: 0.4337 - val_loss: 2.4176 - val_accuracy: 0.2463 Epoch 5/30 119/119 [==============================] - 10s 87ms/step - loss: 1.8100 - accuracy: 0.4604 - val_loss: 1.9824 - val_accuracy: 0.3710 Epoch 6/30 119/119 [==============================] - 10s 83ms/step - loss: 1.7306 - accuracy: 0.4853 - val_loss: 1.9383 - val_accuracy: 0.3964 Epoch 7/30 119/119 [==============================] - 10s 84ms/step - loss: 1.6643 - accuracy: 0.4976 - val_loss: 1.8491 - val_accuracy: 0.4080 Epoch 8/30 119/119 [==============================] - 10s 84ms/step - loss: 1.5924 - accuracy: 0.5193 - val_loss: 1.8359 - val_accuracy: 0.4207 Epoch 9/30 119/119 [==============================] - 10s 84ms/step - loss: 1.5182 - accuracy: 0.5475 - val_loss: 1.8025 - val_accuracy: 0.4355 Epoch 10/30 119/119 [==============================] - 10s 85ms/step - loss: 1.4730 - accuracy: 0.5582 - val_loss: 1.7979 - val_accuracy: 0.4355 Epoch 11/30 119/119 [==============================] - 10s 84ms/step - loss: 1.4068 - accuracy: 0.5829 - val_loss: 1.7832 - val_accuracy: 0.4366 Epoch 12/30 119/119 [==============================] - 10s 85ms/step - loss: 1.3543 - accuracy: 0.5976 - val_loss: 1.7405 - val_accuracy: 0.4535 Epoch 13/30 119/119 [==============================] - 10s 84ms/step - loss: 1.3037 - accuracy: 0.6120 - val_loss: 1.7437 - val_accuracy: 0.4609 Epoch 14/30 119/119 [==============================] - 10s 84ms/step - loss: 1.2492 - accuracy: 0.6306 - val_loss: 1.7147 - val_accuracy: 0.4715 Epoch 15/30 119/119 [==============================] - 10s 85ms/step - loss: 1.2082 - accuracy: 0.6488 - val_loss: 1.7831 - val_accuracy: 0.4683 Epoch 16/30 119/119 [==============================] - 10s 83ms/step - loss: 1.1538 - accuracy: 0.6676 - val_loss: 1.7509 - val_accuracy: 0.4567 Epoch 17/30 119/119 [==============================] - 10s 85ms/step - loss: 1.1030 - accuracy: 0.6790 - val_loss: 1.7289 - val_accuracy: 0.4630 Epoch 18/30 119/119 [==============================] - 10s 84ms/step - loss: 1.0699 - accuracy: 0.6980 - val_loss: 1.7040 - val_accuracy: 0.4852 Epoch 19/30 119/119 [==============================] - 10s 84ms/step - loss: 1.0016 - accuracy: 0.7204 - val_loss: 1.7678 - val_accuracy: 0.4926 Epoch 20/30 119/119 [==============================] - 10s 86ms/step - loss: 0.9692 - accuracy: 0.7272 - val_loss: 1.7062 - val_accuracy: 0.4736 Epoch 21/30 119/119 [==============================] - 10s 84ms/step - loss: 0.9211 - accuracy: 0.7440 - val_loss: 1.7093 - val_accuracy: 0.4863 Epoch 22/30 119/119 [==============================] - 10s 83ms/step - loss: 0.8873 - accuracy: 0.7549 - val_loss: 1.7337 - val_accuracy: 0.4863 Epoch 23/30 119/119 [==============================] - 10s 85ms/step - loss: 0.8455 - accuracy: 0.7716 - val_loss: 1.7671 - val_accuracy: 0.4810 Epoch 24/30 119/119 [==============================] - 10s 84ms/step - loss: 0.7977 - accuracy: 0.7858 - val_loss: 1.7042 - val_accuracy: 0.4863 Epoch 25/30 119/119 [==============================] - 10s 84ms/step - loss: 0.7565 - accuracy: 0.7999 - val_loss: 1.7460 - val_accuracy: 0.4947 Epoch 26/30 119/119 [==============================] - 10s 85ms/step - loss: 0.7211 - accuracy: 0.8122 - val_loss: 1.7431 - val_accuracy: 0.5011 Epoch 27/30 119/119 [==============================] - 10s 83ms/step - loss: 0.6795 - accuracy: 0.8275 - val_loss: 1.7441 - val_accuracy: 0.4757 Epoch 28/30 119/119 [==============================] - 10s 84ms/step - loss: 0.6486 - accuracy: 0.8356 - val_loss: 1.7915 - val_accuracy: 0.4915 Epoch 29/30 119/119 [==============================] - 10s 85ms/step - loss: 0.6115 - accuracy: 0.8550 - val_loss: 1.7964 - val_accuracy: 0.5021 Epoch 30/30 119/119 [==============================] - 10s 84ms/step - loss: 0.5825 - accuracy: 0.8598 - val_loss: 1.7533 - val_accuracy: 0.4947 15/15 [==============================] - 0s 27ms/step - loss: 1.6744 - accuracy: 0.5201
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 64) (None, 32, 32, 128) (None, 32, 32, 128) Epoch 1/30 119/119 [==============================] - 92s 768ms/step - loss: 2.5061 - accuracy: 0.2392 - val_loss: 3.4325 - val_accuracy: 0.0782 Epoch 2/30 119/119 [==============================] - 92s 774ms/step - loss: 2.1400 - accuracy: 0.3385 - val_loss: 3.3008 - val_accuracy: 0.1501 Epoch 3/30 119/119 [==============================] - 91s 768ms/step - loss: 1.9637 - accuracy: 0.3882 - val_loss: 2.5070 - val_accuracy: 0.2812 Epoch 4/30 119/119 [==============================] - 94s 789ms/step - loss: 1.8381 - accuracy: 0.4343 - val_loss: 2.0858 - val_accuracy: 0.3753 Epoch 5/30 119/119 [==============================] - 95s 796ms/step - loss: 1.7251 - accuracy: 0.4607 - val_loss: 2.4850 - val_accuracy: 0.3161 Epoch 6/30 119/119 [==============================] - 93s 779ms/step - loss: 1.6210 - accuracy: 0.4950 - val_loss: 1.9385 - val_accuracy: 0.4006 Epoch 7/30 119/119 [==============================] - 92s 770ms/step - loss: 1.5456 - accuracy: 0.5159 - val_loss: 1.9169 - val_accuracy: 0.4080 Epoch 8/30 119/119 [==============================] - 92s 772ms/step - loss: 1.4859 - accuracy: 0.5369 - val_loss: 2.1015 - val_accuracy: 0.3922 Epoch 9/30 119/119 [==============================] - 92s 774ms/step - loss: 1.4041 - accuracy: 0.5604 - val_loss: 2.3631 - val_accuracy: 0.3658 Epoch 10/30 119/119 [==============================] - 93s 781ms/step - loss: 1.3373 - accuracy: 0.5787 - val_loss: 1.8812 - val_accuracy: 0.4440 Epoch 11/30 119/119 [==============================] - 92s 771ms/step - loss: 1.2806 - accuracy: 0.6016 - val_loss: 2.5300 - val_accuracy: 0.3298 Epoch 12/30 119/119 [==============================] - 92s 774ms/step - loss: 1.2212 - accuracy: 0.6242 - val_loss: 1.8837 - val_accuracy: 0.4408 Epoch 13/30 119/119 [==============================] - 92s 773ms/step - loss: 1.1705 - accuracy: 0.6331 - val_loss: 2.0375 - val_accuracy: 0.4503 Epoch 14/30 119/119 [==============================] - 92s 774ms/step - loss: 1.1305 - accuracy: 0.6483 - val_loss: 2.6786 - val_accuracy: 0.3488 Epoch 15/30 119/119 [==============================] - 91s 769ms/step - loss: 1.0788 - accuracy: 0.6599 - val_loss: 1.6440 - val_accuracy: 0.5381 Epoch 16/30 119/119 [==============================] - 91s 768ms/step - loss: 1.0483 - accuracy: 0.6697 - val_loss: 2.5340 - val_accuracy: 0.3869 Epoch 17/30 119/119 [==============================] - 93s 782ms/step - loss: 0.9899 - accuracy: 0.6934 - val_loss: 1.6603 - val_accuracy: 0.4968 Epoch 18/30 119/119 [==============================] - 93s 780ms/step - loss: 0.9572 - accuracy: 0.7004 - val_loss: 1.5398 - val_accuracy: 0.5539 Epoch 19/30 119/119 [==============================] - 92s 775ms/step - loss: 0.9097 - accuracy: 0.7177 - val_loss: 1.7929 - val_accuracy: 0.4979 Epoch 20/30 119/119 [==============================] - 92s 772ms/step - loss: 0.8833 - accuracy: 0.7218 - val_loss: 1.8473 - val_accuracy: 0.4757 Epoch 21/30 119/119 [==============================] - 93s 784ms/step - loss: 0.8216 - accuracy: 0.7417 - val_loss: 1.8516 - val_accuracy: 0.4958 Epoch 22/30 119/119 [==============================] - 91s 768ms/step - loss: 0.8137 - accuracy: 0.7497 - val_loss: 1.9918 - val_accuracy: 0.4926 Epoch 23/30 119/119 [==============================] - 92s 774ms/step - loss: 0.7512 - accuracy: 0.7706 - val_loss: 1.8060 - val_accuracy: 0.4831 Epoch 24/30 119/119 [==============================] - 93s 783ms/step - loss: 0.7303 - accuracy: 0.7741 - val_loss: 3.1188 - val_accuracy: 0.3467 Epoch 25/30 119/119 [==============================] - 93s 785ms/step - loss: 0.6880 - accuracy: 0.7888 - val_loss: 2.7505 - val_accuracy: 0.4440 Epoch 26/30 119/119 [==============================] - 92s 771ms/step - loss: 0.6537 - accuracy: 0.7983 - val_loss: 2.5711 - val_accuracy: 0.4038 Epoch 27/30 119/119 [==============================] - 92s 772ms/step - loss: 0.6306 - accuracy: 0.8067 - val_loss: 1.7983 - val_accuracy: 0.4968 Epoch 28/30 119/119 [==============================] - 92s 776ms/step - loss: 0.6035 - accuracy: 0.8185 - val_loss: 1.4119 - val_accuracy: 0.5877 Epoch 29/30 119/119 [==============================] - 93s 782ms/step - loss: 0.5894 - accuracy: 0.8247 - val_loss: 1.8459 - val_accuracy: 0.5349 Epoch 30/30 119/119 [==============================] - 92s 774ms/step - loss: 0.5462 - accuracy: 0.8327 - val_loss: 2.5446 - val_accuracy: 0.4334 15/15 [==============================] - 3s 223ms/step - loss: 2.4449 - accuracy: 0.4588
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64) (None, 32, 32, 64) (None, 32, 32, 128) (None, 32, 32, 128) Epoch 1/30 119/119 [==============================] - 93s 779ms/step - loss: 2.7825 - accuracy: 0.1837 - val_loss: 3.0065 - val_accuracy: 0.0655 Epoch 2/30 119/119 [==============================] - 93s 785ms/step - loss: 2.4630 - accuracy: 0.2571 - val_loss: 2.8331 - val_accuracy: 0.1173 Epoch 3/30 119/119 [==============================] - 94s 790ms/step - loss: 2.3604 - accuracy: 0.2903 - val_loss: 2.5022 - val_accuracy: 0.2230 Epoch 4/30 119/119 [==============================] - 93s 779ms/step - loss: 2.2689 - accuracy: 0.3184 - val_loss: 2.3186 - val_accuracy: 0.2653 Epoch 5/30 119/119 [==============================] - 92s 773ms/step - loss: 2.1968 - accuracy: 0.3452 - val_loss: 2.2480 - val_accuracy: 0.2918 Epoch 6/30 119/119 [==============================] - 93s 781ms/step - loss: 2.1397 - accuracy: 0.3509 - val_loss: 2.2450 - val_accuracy: 0.3140 Epoch 7/30 119/119 [==============================] - 92s 771ms/step - loss: 2.0902 - accuracy: 0.3675 - val_loss: 2.1603 - val_accuracy: 0.3235 Epoch 8/30 119/119 [==============================] - 92s 771ms/step - loss: 2.0456 - accuracy: 0.3817 - val_loss: 2.1233 - val_accuracy: 0.3393 Epoch 9/30 119/119 [==============================] - 92s 769ms/step - loss: 2.0043 - accuracy: 0.3937 - val_loss: 2.0802 - val_accuracy: 0.3520 Epoch 10/30 119/119 [==============================] - 93s 780ms/step - loss: 1.9670 - accuracy: 0.4077 - val_loss: 2.0376 - val_accuracy: 0.3700 Epoch 11/30 119/119 [==============================] - 92s 771ms/step - loss: 1.9261 - accuracy: 0.4175 - val_loss: 2.0072 - val_accuracy: 0.3858 Epoch 12/30 119/119 [==============================] - 92s 773ms/step - loss: 1.8915 - accuracy: 0.4271 - val_loss: 2.0146 - val_accuracy: 0.3901 Epoch 13/30 119/119 [==============================] - 94s 788ms/step - loss: 1.8642 - accuracy: 0.4376 - val_loss: 1.9898 - val_accuracy: 0.3911 Epoch 14/30 119/119 [==============================] - 95s 801ms/step - loss: 1.8304 - accuracy: 0.4459 - val_loss: 1.9197 - val_accuracy: 0.4133 Epoch 15/30 119/119 [==============================] - 93s 779ms/step - loss: 1.8056 - accuracy: 0.4615 - val_loss: 1.9078 - val_accuracy: 0.4070 Epoch 16/30 119/119 [==============================] - 92s 769ms/step - loss: 1.7728 - accuracy: 0.4669 - val_loss: 1.9442 - val_accuracy: 0.4080 Epoch 17/30 119/119 [==============================] - 92s 771ms/step - loss: 1.7483 - accuracy: 0.4697 - val_loss: 1.8707 - val_accuracy: 0.4186 Epoch 18/30 119/119 [==============================] - 92s 773ms/step - loss: 1.7280 - accuracy: 0.4847 - val_loss: 1.8787 - val_accuracy: 0.4186 Epoch 19/30 119/119 [==============================] - 93s 778ms/step - loss: 1.6937 - accuracy: 0.4958 - val_loss: 1.8903 - val_accuracy: 0.4419 Epoch 20/30 119/119 [==============================] - 92s 772ms/step - loss: 1.6760 - accuracy: 0.5009 - val_loss: 1.8343 - val_accuracy: 0.4249 Epoch 21/30 119/119 [==============================] - 92s 771ms/step - loss: 1.6454 - accuracy: 0.5046 - val_loss: 1.8050 - val_accuracy: 0.4440 Epoch 22/30 119/119 [==============================] - 93s 780ms/step - loss: 1.6285 - accuracy: 0.5123 - val_loss: 1.8272 - val_accuracy: 0.4408 Epoch 23/30 119/119 [==============================] - 92s 775ms/step - loss: 1.6033 - accuracy: 0.5274 - val_loss: 1.8058 - val_accuracy: 0.4524 Epoch 24/30 119/119 [==============================] - 92s 771ms/step - loss: 1.5884 - accuracy: 0.5283 - val_loss: 1.7961 - val_accuracy: 0.4588 Epoch 25/30 119/119 [==============================] - 91s 767ms/step - loss: 1.5607 - accuracy: 0.5253 - val_loss: 1.8142 - val_accuracy: 0.4493 Epoch 26/30 119/119 [==============================] - 93s 778ms/step - loss: 1.5357 - accuracy: 0.5433 - val_loss: 1.8431 - val_accuracy: 0.4387 Epoch 27/30 119/119 [==============================] - 92s 773ms/step - loss: 1.5262 - accuracy: 0.5489 - val_loss: 1.7957 - val_accuracy: 0.4630 Epoch 28/30 119/119 [==============================] - 92s 774ms/step - loss: 1.5147 - accuracy: 0.5585 - val_loss: 1.7441 - val_accuracy: 0.4715 Epoch 29/30 119/119 [==============================] - 92s 770ms/step - loss: 1.4942 - accuracy: 0.5582 - val_loss: 1.6971 - val_accuracy: 0.4915 Epoch 30/30 119/119 [==============================] - 93s 779ms/step - loss: 1.4704 - accuracy: 0.5623 - val_loss: 1.6810 - val_accuracy: 0.5032 15/15 [==============================] - 3s 207ms/step - loss: 1.6090 - accuracy: 0.5042
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32) (None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64) (None, 4, 4, 64) (None, 4, 4, 128) (None, 4, 4, 128) Epoch 1/30 119/119 [==============================] - 10s 87ms/step - loss: 2.3688 - accuracy: 0.2776 - val_loss: 3.2430 - val_accuracy: 0.0877 Epoch 2/30 119/119 [==============================] - 10s 85ms/step - loss: 1.9497 - accuracy: 0.3999 - val_loss: 2.8502 - val_accuracy: 0.1776 Epoch 3/30 119/119 [==============================] - 10s 86ms/step - loss: 1.7221 - accuracy: 0.4696 - val_loss: 2.5000 - val_accuracy: 0.2474 Epoch 4/30 119/119 [==============================] - 10s 85ms/step - loss: 1.5607 - accuracy: 0.5180 - val_loss: 2.0859 - val_accuracy: 0.3975 Epoch 5/30 119/119 [==============================] - 10s 86ms/step - loss: 1.4028 - accuracy: 0.5639 - val_loss: 1.8938 - val_accuracy: 0.4260 Epoch 6/30 119/119 [==============================] - 10s 86ms/step - loss: 1.2712 - accuracy: 0.6065 - val_loss: 1.8184 - val_accuracy: 0.4630 Epoch 7/30 119/119 [==============================] - 10s 85ms/step - loss: 1.1500 - accuracy: 0.6458 - val_loss: 2.0550 - val_accuracy: 0.4017 Epoch 8/30 119/119 [==============================] - 10s 85ms/step - loss: 1.0292 - accuracy: 0.6747 - val_loss: 1.9461 - val_accuracy: 0.4281 Epoch 9/30 119/119 [==============================] - 10s 86ms/step - loss: 0.9093 - accuracy: 0.7159 - val_loss: 2.0212 - val_accuracy: 0.4334 Epoch 10/30 119/119 [==============================] - 10s 85ms/step - loss: 0.8079 - accuracy: 0.7491 - val_loss: 1.9461 - val_accuracy: 0.4757 Epoch 11/30 119/119 [==============================] - 10s 86ms/step - loss: 0.6547 - accuracy: 0.7992 - val_loss: 1.9913 - val_accuracy: 0.4736 Epoch 12/30 119/119 [==============================] - 10s 85ms/step - loss: 0.5649 - accuracy: 0.8245 - val_loss: 1.8740 - val_accuracy: 0.5000 Epoch 13/30 119/119 [==============================] - 10s 85ms/step - loss: 0.4339 - accuracy: 0.8747 - val_loss: 2.6336 - val_accuracy: 0.4091 Epoch 14/30 119/119 [==============================] - 10s 84ms/step - loss: 0.3573 - accuracy: 0.9004 - val_loss: 2.1331 - val_accuracy: 0.4641 Epoch 15/30 119/119 [==============================] - 10s 85ms/step - loss: 0.3025 - accuracy: 0.9152 - val_loss: 2.7283 - val_accuracy: 0.3964 Epoch 16/30 119/119 [==============================] - 10s 85ms/step - loss: 0.2696 - accuracy: 0.9220 - val_loss: 2.2520 - val_accuracy: 0.4556 Epoch 17/30 119/119 [==============================] - 10s 85ms/step - loss: 0.1764 - accuracy: 0.9567 - val_loss: 3.2836 - val_accuracy: 0.3911 Epoch 18/30 119/119 [==============================] - 10s 84ms/step - loss: 0.2429 - accuracy: 0.9272 - val_loss: 2.9551 - val_accuracy: 0.4281 Epoch 19/30 119/119 [==============================] - 10s 86ms/step - loss: 0.1590 - accuracy: 0.9589 - val_loss: 2.4871 - val_accuracy: 0.4693 Epoch 20/30 119/119 [==============================] - 10s 85ms/step - loss: 0.1034 - accuracy: 0.9778 - val_loss: 2.6174 - val_accuracy: 0.4567 Epoch 21/30 119/119 [==============================] - 10s 86ms/step - loss: 0.0703 - accuracy: 0.9862 - val_loss: 2.4813 - val_accuracy: 0.4746 Epoch 22/30 119/119 [==============================] - 10s 85ms/step - loss: 0.1117 - accuracy: 0.9702 - val_loss: 2.3009 - val_accuracy: 0.4884 Epoch 23/30 119/119 [==============================] - 10s 85ms/step - loss: 0.0915 - accuracy: 0.9767 - val_loss: 2.5429 - val_accuracy: 0.4820 Epoch 24/30 119/119 [==============================] - 10s 84ms/step - loss: 0.0607 - accuracy: 0.9884 - val_loss: 3.0438 - val_accuracy: 0.4609 Epoch 25/30 119/119 [==============================] - 10s 85ms/step - loss: 0.1500 - accuracy: 0.9537 - val_loss: 3.0143 - val_accuracy: 0.4387 Epoch 26/30 119/119 [==============================] - 10s 85ms/step - loss: 0.0676 - accuracy: 0.9841 - val_loss: 3.0755 - val_accuracy: 0.4598 Epoch 27/30 119/119 [==============================] - 10s 85ms/step - loss: 0.0304 - accuracy: 0.9960 - val_loss: 2.2313 - val_accuracy: 0.5338 Epoch 28/30 119/119 [==============================] - 10s 85ms/step - loss: 0.0476 - accuracy: 0.9905 - val_loss: 2.6797 - val_accuracy: 0.5011 Epoch 29/30 119/119 [==============================] - 10s 84ms/step - loss: 0.1747 - accuracy: 0.9447 - val_loss: 3.6261 - val_accuracy: 0.4006 Epoch 30/30 119/119 [==============================] - 10s 85ms/step - loss: 0.1255 - accuracy: 0.9604 - val_loss: 2.9908 - val_accuracy: 0.4630 15/15 [==============================] - 0s 28ms/step - loss: 2.7010 - accuracy: 0.4937
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 16, 16, 16) (None, 16, 16, 32) (None, 16, 16, 32) (None, 8, 8, 32) (None, 8, 8, 64) (None, 8, 8, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 4, 4, 64) (None, 4, 4, 128) (None, 4, 4, 128) Epoch 1/30 119/119 [==============================] - 11s 88ms/step - loss: 2.7374 - accuracy: 0.1974 - val_loss: 3.0253 - val_accuracy: 0.0698 Epoch 2/30 119/119 [==============================] - 10s 86ms/step - loss: 2.3000 - accuracy: 0.3249 - val_loss: 3.1451 - val_accuracy: 0.0951 Epoch 3/30 119/119 [==============================] - 10s 86ms/step - loss: 2.1184 - accuracy: 0.3692 - val_loss: 2.7375 - val_accuracy: 0.1966 Epoch 4/30 119/119 [==============================] - 10s 85ms/step - loss: 1.9682 - accuracy: 0.4147 - val_loss: 2.1943 - val_accuracy: 0.3214 Epoch 5/30 119/119 [==============================] - 10s 83ms/step - loss: 1.8559 - accuracy: 0.4460 - val_loss: 2.0400 - val_accuracy: 0.3700 Epoch 6/30 119/119 [==============================] - 10s 84ms/step - loss: 1.7722 - accuracy: 0.4697 - val_loss: 1.9833 - val_accuracy: 0.3996 Epoch 7/30 119/119 [==============================] - 10s 86ms/step - loss: 1.6983 - accuracy: 0.4874 - val_loss: 1.8708 - val_accuracy: 0.4091 Epoch 8/30 119/119 [==============================] - 10s 84ms/step - loss: 1.6215 - accuracy: 0.5126 - val_loss: 1.8408 - val_accuracy: 0.4345 Epoch 9/30 119/119 [==============================] - 10s 85ms/step - loss: 1.5580 - accuracy: 0.5323 - val_loss: 1.8092 - val_accuracy: 0.4408 Epoch 10/30 119/119 [==============================] - 10s 84ms/step - loss: 1.4958 - accuracy: 0.5474 - val_loss: 1.8041 - val_accuracy: 0.4228 Epoch 11/30 119/119 [==============================] - 10s 84ms/step - loss: 1.4365 - accuracy: 0.5681 - val_loss: 1.7991 - val_accuracy: 0.4471 Epoch 12/30 119/119 [==============================] - 10s 84ms/step - loss: 1.3794 - accuracy: 0.5847 - val_loss: 1.7582 - val_accuracy: 0.4524 Epoch 13/30 119/119 [==============================] - 10s 85ms/step - loss: 1.3263 - accuracy: 0.5976 - val_loss: 1.7915 - val_accuracy: 0.4440 Epoch 14/30 119/119 [==============================] - 10s 86ms/step - loss: 1.2739 - accuracy: 0.6192 - val_loss: 1.7526 - val_accuracy: 0.4440 Epoch 15/30 119/119 [==============================] - 10s 83ms/step - loss: 1.2243 - accuracy: 0.6377 - val_loss: 1.7689 - val_accuracy: 0.4524 Epoch 16/30 119/119 [==============================] - 10s 84ms/step - loss: 1.1676 - accuracy: 0.6536 - val_loss: 1.6976 - val_accuracy: 0.4799 Epoch 17/30 119/119 [==============================] - 10s 84ms/step - loss: 1.1202 - accuracy: 0.6704 - val_loss: 1.6949 - val_accuracy: 0.4704 Epoch 18/30 119/119 [==============================] - 10s 84ms/step - loss: 1.0843 - accuracy: 0.6881 - val_loss: 1.7078 - val_accuracy: 0.4831 Epoch 19/30 119/119 [==============================] - 10s 84ms/step - loss: 1.0266 - accuracy: 0.7036 - val_loss: 1.6967 - val_accuracy: 0.4799 Epoch 20/30 119/119 [==============================] - 10s 84ms/step - loss: 0.9905 - accuracy: 0.7139 - val_loss: 1.7196 - val_accuracy: 0.4683 Epoch 21/30 119/119 [==============================] - 10s 84ms/step - loss: 0.9409 - accuracy: 0.7356 - val_loss: 1.7235 - val_accuracy: 0.4757 Epoch 22/30 119/119 [==============================] - 10s 85ms/step - loss: 0.9000 - accuracy: 0.7466 - val_loss: 1.7483 - val_accuracy: 0.4767 Epoch 23/30 119/119 [==============================] - 10s 84ms/step - loss: 0.8537 - accuracy: 0.7642 - val_loss: 1.7941 - val_accuracy: 0.4641 Epoch 24/30 119/119 [==============================] - 10s 84ms/step - loss: 0.8117 - accuracy: 0.7734 - val_loss: 1.7263 - val_accuracy: 0.4789 Epoch 25/30 119/119 [==============================] - 10s 85ms/step - loss: 0.7765 - accuracy: 0.7913 - val_loss: 1.7320 - val_accuracy: 0.4831 Epoch 26/30 119/119 [==============================] - 10s 84ms/step - loss: 0.7347 - accuracy: 0.8066 - val_loss: 1.7658 - val_accuracy: 0.4841 Epoch 27/30 119/119 [==============================] - 10s 84ms/step - loss: 0.6961 - accuracy: 0.8221 - val_loss: 1.7102 - val_accuracy: 0.4989 Epoch 28/30 119/119 [==============================] - 10s 85ms/step - loss: 0.6617 - accuracy: 0.8312 - val_loss: 1.7539 - val_accuracy: 0.4831 Epoch 29/30 119/119 [==============================] - 10s 83ms/step - loss: 0.6246 - accuracy: 0.8435 - val_loss: 1.7262 - val_accuracy: 0.4968 Epoch 30/30 119/119 [==============================] - 10s 84ms/step - loss: 0.5900 - accuracy: 0.8567 - val_loss: 1.7446 - val_accuracy: 0.4979 15/15 [==============================] - 0s 29ms/step - loss: 1.7125 - accuracy: 0.5000
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64)
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 64) (None, 32, 32, 128) (None, 32, 32, 128) Epoch 1/30 119/119 [==============================] - 93s 778ms/step - loss: 2.4797 - accuracy: 0.2426 - val_loss: 3.2703 - val_accuracy: 0.0751 Epoch 2/30 119/119 [==============================] - 92s 771ms/step - loss: 2.1218 - accuracy: 0.3447 - val_loss: 3.5075 - val_accuracy: 0.1036 Epoch 3/30 119/119 [==============================] - 92s 771ms/step - loss: 1.9604 - accuracy: 0.3886 - val_loss: 2.7113 - val_accuracy: 0.1712 Epoch 4/30 119/119 [==============================] - 93s 778ms/step - loss: 1.8223 - accuracy: 0.4385 - val_loss: 2.1073 - val_accuracy: 0.3256 Epoch 5/30 119/119 [==============================] - 93s 781ms/step - loss: 1.7025 - accuracy: 0.4733 - val_loss: 2.7600 - val_accuracy: 0.2801 Epoch 6/30 119/119 [==============================] - 92s 770ms/step - loss: 1.5933 - accuracy: 0.5062 - val_loss: 2.0854 - val_accuracy: 0.3626 Epoch 7/30 119/119 [==============================] - 92s 773ms/step - loss: 1.5110 - accuracy: 0.5312 - val_loss: 1.8774 - val_accuracy: 0.4154 Epoch 8/30 119/119 [==============================] - 91s 768ms/step - loss: 1.4501 - accuracy: 0.5468 - val_loss: 2.0774 - val_accuracy: 0.3531 Epoch 9/30 119/119 [==============================] - 94s 787ms/step - loss: 1.3587 - accuracy: 0.5761 - val_loss: 2.2962 - val_accuracy: 0.3763 Epoch 10/30 119/119 [==============================] - 92s 776ms/step - loss: 1.3063 - accuracy: 0.5950 - val_loss: 2.2757 - val_accuracy: 0.3605 Epoch 11/30 119/119 [==============================] - 92s 773ms/step - loss: 1.2354 - accuracy: 0.6153 - val_loss: 1.8795 - val_accuracy: 0.4397 Epoch 12/30 119/119 [==============================] - 92s 772ms/step - loss: 1.1946 - accuracy: 0.6312 - val_loss: 1.8045 - val_accuracy: 0.4810 Epoch 13/30 119/119 [==============================] - 93s 782ms/step - loss: 1.1389 - accuracy: 0.6456 - val_loss: 2.7180 - val_accuracy: 0.3869 Epoch 14/30 119/119 [==============================] - 92s 773ms/step - loss: 1.0965 - accuracy: 0.6554 - val_loss: 2.5244 - val_accuracy: 0.3076 Epoch 15/30 119/119 [==============================] - 92s 777ms/step - loss: 1.0353 - accuracy: 0.6772 - val_loss: 2.3583 - val_accuracy: 0.4027 Epoch 16/30 119/119 [==============================] - 93s 781ms/step - loss: 1.0066 - accuracy: 0.6843 - val_loss: 1.9134 - val_accuracy: 0.4545 Epoch 17/30 119/119 [==============================] - 95s 795ms/step - loss: 0.9455 - accuracy: 0.7075 - val_loss: 1.7326 - val_accuracy: 0.4841 Epoch 18/30 119/119 [==============================] - 92s 773ms/step - loss: 0.9183 - accuracy: 0.7060 - val_loss: 2.0191 - val_accuracy: 0.4059 Epoch 19/30 119/119 [==============================] - 92s 773ms/step - loss: 0.8671 - accuracy: 0.7340 - val_loss: 3.5855 - val_accuracy: 0.3541 Epoch 20/30 119/119 [==============================] - 92s 773ms/step - loss: 0.8450 - accuracy: 0.7370 - val_loss: 1.6131 - val_accuracy: 0.5254 Epoch 21/30 119/119 [==============================] - 93s 778ms/step - loss: 0.7802 - accuracy: 0.7513 - val_loss: 2.8614 - val_accuracy: 0.3753 Epoch 22/30 119/119 [==============================] - 92s 776ms/step - loss: 0.7714 - accuracy: 0.7626 - val_loss: 1.6595 - val_accuracy: 0.5497 Epoch 23/30 119/119 [==============================] - 92s 774ms/step - loss: 0.7118 - accuracy: 0.7821 - val_loss: 1.9047 - val_accuracy: 0.4651 Epoch 24/30 119/119 [==============================] - 93s 778ms/step - loss: 0.6736 - accuracy: 0.7946 - val_loss: 1.5402 - val_accuracy: 0.5423 Epoch 25/30 119/119 [==============================] - 92s 769ms/step - loss: 0.6358 - accuracy: 0.8053 - val_loss: 1.7402 - val_accuracy: 0.5190 Epoch 26/30 119/119 [==============================] - 92s 770ms/step - loss: 0.6005 - accuracy: 0.8188 - val_loss: 1.7459 - val_accuracy: 0.5391 Epoch 27/30 119/119 [==============================] - 92s 770ms/step - loss: 0.5894 - accuracy: 0.8151 - val_loss: 1.4233 - val_accuracy: 0.5740 Epoch 28/30 119/119 [==============================] - 93s 779ms/step - loss: 0.5485 - accuracy: 0.8298 - val_loss: 2.3639 - val_accuracy: 0.4249 Epoch 29/30 119/119 [==============================] - 92s 774ms/step - loss: 0.5281 - accuracy: 0.8398 - val_loss: 1.8030 - val_accuracy: 0.5169 Epoch 30/30 119/119 [==============================] - 92s 774ms/step - loss: 0.4911 - accuracy: 0.8513 - val_loss: 2.3424 - val_accuracy: 0.4693 15/15 [==============================] - 3s 207ms/step - loss: 2.1687 - accuracy: 0.4937
WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`. WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.
(None, 32, 32, 3) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 16) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 32) (None, 32, 32, 64) (None, 32, 32, 64) (None, 32, 32, 64) (None, 32, 32, 128) (None, 32, 32, 128) Epoch 1/30 119/119 [==============================] - 92s 771ms/step - loss: 2.7917 - accuracy: 0.1835 - val_loss: 3.1278 - val_accuracy: 0.0761 Epoch 2/30 119/119 [==============================] - 92s 777ms/step - loss: 2.4628 - accuracy: 0.2754 - val_loss: 3.0352 - val_accuracy: 0.1004 Epoch 3/30 119/119 [==============================] - 93s 778ms/step - loss: 2.3483 - accuracy: 0.2951 - val_loss: 2.6549 - val_accuracy: 0.1892 Epoch 4/30 119/119 [==============================] - 92s 772ms/step - loss: 2.2489 - accuracy: 0.3216 - val_loss: 2.3375 - val_accuracy: 0.3002 Epoch 5/30 119/119 [==============================] - 94s 786ms/step - loss: 2.1659 - accuracy: 0.3525 - val_loss: 2.2989 - val_accuracy: 0.2812 Epoch 6/30 119/119 [==============================] - 94s 787ms/step - loss: 2.1007 - accuracy: 0.3656 - val_loss: 2.2593 - val_accuracy: 0.3044 Epoch 7/30 119/119 [==============================] - 92s 770ms/step - loss: 2.0516 - accuracy: 0.3866 - val_loss: 2.1470 - val_accuracy: 0.3340 Epoch 8/30 119/119 [==============================] - 92s 773ms/step - loss: 1.9989 - accuracy: 0.3987 - val_loss: 2.0914 - val_accuracy: 0.3710 Epoch 9/30 119/119 [==============================] - 93s 781ms/step - loss: 1.9603 - accuracy: 0.4111 - val_loss: 2.0989 - val_accuracy: 0.3520 Epoch 10/30 119/119 [==============================] - 92s 775ms/step - loss: 1.9232 - accuracy: 0.4239 - val_loss: 2.0475 - val_accuracy: 0.3605 Epoch 11/30 119/119 [==============================] - 91s 769ms/step - loss: 1.8817 - accuracy: 0.4324 - val_loss: 2.0483 - val_accuracy: 0.3975 Epoch 12/30 119/119 [==============================] - 92s 770ms/step - loss: 1.8541 - accuracy: 0.4422 - val_loss: 1.9675 - val_accuracy: 0.3953 Epoch 13/30 119/119 [==============================] - 92s 774ms/step - loss: 1.8195 - accuracy: 0.4549 - val_loss: 2.0158 - val_accuracy: 0.3932 Epoch 14/30 119/119 [==============================] - 92s 773ms/step - loss: 1.7932 - accuracy: 0.4597 - val_loss: 1.9514 - val_accuracy: 0.4038 Epoch 15/30 119/119 [==============================] - 91s 768ms/step - loss: 1.7682 - accuracy: 0.4730 - val_loss: 1.9430 - val_accuracy: 0.4027 Epoch 16/30 119/119 [==============================] - 91s 768ms/step - loss: 1.7372 - accuracy: 0.4810 - val_loss: 1.8798 - val_accuracy: 0.4154 Epoch 17/30 119/119 [==============================] - 93s 778ms/step - loss: 1.7101 - accuracy: 0.4906 - val_loss: 1.8655 - val_accuracy: 0.4281 Epoch 18/30 119/119 [==============================] - 91s 769ms/step - loss: 1.6952 - accuracy: 0.4918 - val_loss: 1.8215 - val_accuracy: 0.4408 Epoch 19/30 119/119 [==============================] - 92s 771ms/step - loss: 1.6544 - accuracy: 0.5077 - val_loss: 1.8703 - val_accuracy: 0.4197 Epoch 20/30 119/119 [==============================] - 92s 774ms/step - loss: 1.6425 - accuracy: 0.5085 - val_loss: 1.8420 - val_accuracy: 0.4355 Epoch 21/30 119/119 [==============================] - 92s 774ms/step - loss: 1.6178 - accuracy: 0.5138 - val_loss: 1.8107 - val_accuracy: 0.4366 Epoch 22/30 119/119 [==============================] - 92s 773ms/step - loss: 1.6006 - accuracy: 0.5209 - val_loss: 1.8185 - val_accuracy: 0.4281 Epoch 23/30 119/119 [==============================] - 92s 771ms/step - loss: 1.5753 - accuracy: 0.5224 - val_loss: 1.8106 - val_accuracy: 0.4387 Epoch 24/30 119/119 [==============================] - 93s 786ms/step - loss: 1.5625 - accuracy: 0.5299 - val_loss: 1.8971 - val_accuracy: 0.4271 Epoch 25/30 119/119 [==============================] - 95s 802ms/step - loss: 1.5351 - accuracy: 0.5380 - val_loss: 1.7986 - val_accuracy: 0.4535 Epoch 26/30 119/119 [==============================] - 93s 777ms/step - loss: 1.5152 - accuracy: 0.5438 - val_loss: 1.9466 - val_accuracy: 0.4080 Epoch 27/30 119/119 [==============================] - 92s 775ms/step - loss: 1.5019 - accuracy: 0.5505 - val_loss: 1.7746 - val_accuracy: 0.4503 Epoch 28/30 119/119 [==============================] - 93s 783ms/step - loss: 1.4850 - accuracy: 0.5622 - val_loss: 1.7164 - val_accuracy: 0.4577 Epoch 29/30 119/119 [==============================] - 92s 771ms/step - loss: 1.4686 - accuracy: 0.5642 - val_loss: 1.7044 - val_accuracy: 0.4725 Epoch 30/30 119/119 [==============================] - 93s 778ms/step - loss: 1.4490 - accuracy: 0.5706 - val_loss: 1.7182 - val_accuracy: 0.4715 15/15 [==============================] - 3s 217ms/step - loss: 1.5830 - accuracy: 0.5159
print (best_config)
(4, True, True, 0.001, 0.5877378582954407)
**Question 3.3** We now try to apply data augmentation to improve the performance. Extend the code of the class YourModel so that if the attribute is_augmentation is set to True, we apply the data augmentation. Also you need to incorporate early stopping to your training process. Specifically, you early stop the training if the valid accuracy cannot increase in three consecutive epochs.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping
Wtire your code in the cell below. Hint that you can rewrite the code of the fit method to apply the data augmentation. In addition, you can copy the code of build_cnn method above to reuse here.
class YourModel(DefaultModel):
def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
name='network1',
width=32, height=32, depth=3,
num_classes=20,
is_augmentation = False,
activation_func='relu',
optimizer='adam',
batch_size=32,
num_epochs= 20):
super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation,
activation_func, optimizer, batch_size, num_epochs,
learning_rate, verbose)
self.num_channels = num_channels
self.mean_pool = mean_pool
self.batch_norm = batch_norm
self.use_skip = use_skip
self.blocks = blocks
def build_cnn(self,x):
#Insert your code here
x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
if self.batch_norm:
x1 = layers.BatchNormalization() (x1)
x1 = layers.Activation('relu') (x1)
x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
if self.batch_norm:
x2 = layers.BatchNormalization()(x2)
if x.shape != x2.shape:
if x2.shape[3] > x.shape[3]:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
else:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
x2_skip = layers.add([x, x2])
x2_skip = layers.Activation('relu')(x2_skip)
if self.mean_pool:
output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
else:
output_layer = x2_skip
return output_layer
def build_resnet(self):
self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
x = self.input_layer
for i in range (self.blocks):
x = self.build_cnn(x)
self.num_channels = self.num_channels*2
output_layer = GlobalAveragePooling2D()(x)
output_layer = layers.Flatten()(output_layer)
output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
def fit(self, data_manager, batch_size=None, num_epochs=None):
#Insert your code here
batch_size = self.batch_size if batch_size is None else batch_size
num_epochs = self.num_epochs if num_epochs is None else num_epochs
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
callbacks = [early_stopping]
if self.is_augmentation == True:
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
width_shift_range=0.05,
zoom_range = 0.05,
rotation_range=5
)
datagen.fit(data_manager.X_train)
it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = True, batch_size =batch_size)
self.history = self.model.fit(x = it,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
else:
self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
Leverage your best model with the data augmentation and try to observe the difference in performance between using data augmentation and non-using it.
By leveraging the findings from above (best config):
| Blocks | Skip | Pool | Rate | Accuracy |
|---|---|---|---|---|
| 4 | True | True | 0.001 | 58.77% |
I have tried to train it on both with and without data augmentation. Let's look at the results here:
With Data Augmentation + Early Stopping Accuracy:
Without Data Augmentation + Early Stopping Accuracy:
As we can see, data augmentation produces lower training accuracy when compared to without data augmentation. This is due to the fact that data augmentation makes changes to the original training data thus diversifying it and allowing our model to learn how to generalise better instead of relying solely on learning patterns.This can be seen in the higher validation and test accuracy in our model with data augmentation
#Insert your code here. You can add more cells if necessary
testModel = YourModel(32,4, True, True,True, 0.001,True)
testModel.build_resnet()
testModel.fit(data_manager, batch_size = 16, num_epochs = 30)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
Epoch 1/30 473/473 [==============================] - 20s 41ms/step - loss: 2.5141 - accuracy: 0.2286 - val_loss: 2.5227 - val_accuracy: 0.2442 Epoch 2/30 473/473 [==============================] - 20s 41ms/step - loss: 2.1007 - accuracy: 0.3468 - val_loss: 2.0654 - val_accuracy: 0.3668 Epoch 3/30 473/473 [==============================] - 20s 42ms/step - loss: 1.8436 - accuracy: 0.4198 - val_loss: 1.8843 - val_accuracy: 0.4228 Epoch 4/30 473/473 [==============================] - 20s 41ms/step - loss: 1.6569 - accuracy: 0.4819 - val_loss: 1.7000 - val_accuracy: 0.4662 Epoch 5/30 473/473 [==============================] - 19s 41ms/step - loss: 1.4680 - accuracy: 0.5409 - val_loss: 1.7330 - val_accuracy: 0.4746 Epoch 6/30 473/473 [==============================] - 20s 43ms/step - loss: 1.3347 - accuracy: 0.5811 - val_loss: 1.6279 - val_accuracy: 0.5148 Epoch 7/30 473/473 [==============================] - 19s 41ms/step - loss: 1.2063 - accuracy: 0.6224 - val_loss: 1.5319 - val_accuracy: 0.5423 Epoch 8/30 473/473 [==============================] - 20s 42ms/step - loss: 1.0968 - accuracy: 0.6548 - val_loss: 1.5071 - val_accuracy: 0.5497 Epoch 9/30 473/473 [==============================] - 19s 41ms/step - loss: 0.9708 - accuracy: 0.6910 - val_loss: 1.7288 - val_accuracy: 0.5349 Epoch 10/30 473/473 [==============================] - 19s 40ms/step - loss: 0.8555 - accuracy: 0.7279 - val_loss: 1.7070 - val_accuracy: 0.5201 Epoch 11/30 473/473 [==============================] - 19s 40ms/step - loss: 0.7318 - accuracy: 0.7672 - val_loss: 1.5414 - val_accuracy: 0.5698 Epoch 12/30 473/473 [==============================] - 19s 40ms/step - loss: 0.6186 - accuracy: 0.8026 - val_loss: 1.4291 - val_accuracy: 0.6099 Epoch 13/30 473/473 [==============================] - 19s 41ms/step - loss: 0.5340 - accuracy: 0.8303 - val_loss: 1.6177 - val_accuracy: 0.5793 Epoch 14/30 473/473 [==============================] - 19s 40ms/step - loss: 0.4371 - accuracy: 0.8618 - val_loss: 1.6358 - val_accuracy: 0.6004 Epoch 15/30 473/473 [==============================] - 19s 40ms/step - loss: 0.3581 - accuracy: 0.8847 - val_loss: 1.7178 - val_accuracy: 0.5698 15/15 [==============================] - 0s 20ms/step - loss: 1.6682 - accuracy: 0.5751
0.5750528573989868
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 11ms/step
<Figure size 640x480 with 0 Axes>
testModel.model.save('models/augmentation_true_model.h5')
class YourModel(DefaultModel):
def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
name='network1',
width=32, height=32, depth=3,
num_classes=20,
is_augmentation = False,
activation_func='relu',
optimizer='adam',
batch_size=32,
num_epochs= 20):
super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation,
activation_func, optimizer, batch_size, num_epochs,
learning_rate, verbose)
self.num_channels = num_channels
self.mean_pool = mean_pool
self.batch_norm = batch_norm
self.use_skip = use_skip
self.blocks = blocks
def build_cnn(self,x):
#Insert your code here
x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
if self.batch_norm:
x1 = layers.BatchNormalization() (x1)
x1 = layers.Activation('relu') (x1)
x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
if self.batch_norm:
x2 = layers.BatchNormalization()(x2)
if x.shape != x2.shape:
if x2.shape[3] > x.shape[3]:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
else:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
x2_skip = layers.add([x, x2])
x2_skip = layers.Activation('relu')(x2_skip)
if self.mean_pool:
output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
else:
output_layer = x2_skip
return output_layer
def build_resnet(self):
self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
x = self.input_layer
for i in range (self.blocks):
x = self.build_cnn(x)
self.num_channels = self.num_channels*2
output_layer = GlobalAveragePooling2D()(x)
output_layer = layers.Flatten()(output_layer)
output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
def fit(self, data_manager, batch_size=None, num_epochs=None):
#Insert your code here
batch_size = self.batch_size if batch_size is None else batch_size
num_epochs = self.num_epochs if num_epochs is None else num_epochs
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
callbacks = [early_stopping]
if self.is_augmentation == True:
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
brightness_range=(0.9, 1.1),
horizontal_flip=True,
width_shift_range=0.2,
height_shift_range=0.1,
rotation_range= 10,
zoom_range = 0.1
)
datagen.fit(data_manager.X_train)
it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = False, batch_size =batch_size)
self.history = self.model.fit(x = it,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
else:
self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
testModel = YourModel(32,4, True, True,True, 0.001,True)
testModel.build_resnet()
testModel.fit(data_manager, batch_size = 16, num_epochs = 20)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
Epoch 1/20 473/473 [==============================] - 21s 44ms/step - loss: 2.4680 - accuracy: 0.2439 - val_loss: 2.4805 - val_accuracy: 0.2463 Epoch 2/20 473/473 [==============================] - 20s 43ms/step - loss: 2.0364 - accuracy: 0.3698 - val_loss: 2.0306 - val_accuracy: 0.3837 Epoch 3/20 473/473 [==============================] - 20s 42ms/step - loss: 1.7709 - accuracy: 0.4516 - val_loss: 1.8833 - val_accuracy: 0.4397 Epoch 4/20 473/473 [==============================] - 20s 42ms/step - loss: 1.5909 - accuracy: 0.5071 - val_loss: 1.7732 - val_accuracy: 0.4545 Epoch 5/20 473/473 [==============================] - 20s 41ms/step - loss: 1.4055 - accuracy: 0.5636 - val_loss: 1.5745 - val_accuracy: 0.5095 Epoch 6/20 473/473 [==============================] - 19s 41ms/step - loss: 1.2683 - accuracy: 0.6052 - val_loss: 1.5717 - val_accuracy: 0.5148 Epoch 7/20 473/473 [==============================] - 20s 42ms/step - loss: 1.1440 - accuracy: 0.6467 - val_loss: 1.5202 - val_accuracy: 0.5613 Epoch 8/20 473/473 [==============================] - 20s 42ms/step - loss: 1.0157 - accuracy: 0.6758 - val_loss: 1.4315 - val_accuracy: 0.5645 Epoch 9/20 473/473 [==============================] - 20s 42ms/step - loss: 0.8867 - accuracy: 0.7190 - val_loss: 1.9850 - val_accuracy: 0.4799 Epoch 10/20 473/473 [==============================] - 20s 42ms/step - loss: 0.7710 - accuracy: 0.7516 - val_loss: 1.5861 - val_accuracy: 0.5455 Epoch 11/20 473/473 [==============================] - 20s 42ms/step - loss: 0.6408 - accuracy: 0.7909 - val_loss: 1.6589 - val_accuracy: 0.5793 Epoch 12/20 473/473 [==============================] - 20s 42ms/step - loss: 0.5228 - accuracy: 0.8315 - val_loss: 1.7419 - val_accuracy: 0.5603 Epoch 13/20 473/473 [==============================] - 20s 42ms/step - loss: 0.4382 - accuracy: 0.8575 - val_loss: 1.7577 - val_accuracy: 0.5687 Epoch 14/20 473/473 [==============================] - 20s 41ms/step - loss: 0.3249 - accuracy: 0.8971 - val_loss: 2.1832 - val_accuracy: 0.5233 15/15 [==============================] - 0s 19ms/step - loss: 2.2280 - accuracy: 0.5211
0.5211416482925415
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 16ms/step
<Figure size 640x480 with 0 Axes>
testModel.model.save('models/augmentation_false_model.h5')
**Question 3.4** Exploring Data Mixup Technique for Improving Generalization Ability.
Data mixup is another super-simple technique used to boost the generalization ability of deep learning models. You need to incoroporate data mixup technique to the above deep learning model and experiment its performance. There are some papers and documents for data mixup as follows:
You need to extend your model developed above, train a model using data mixup, and write your observations and comments about the result.
With Data Augmentation + Early Stopping Accuracy:
With all the above and data mixup:
We can observe a decrease in performance when data mixup is added. This can be due to several factors which includes: The introduced augmentation to the data may mean our model is too simple to recognise patterns within them and may be struggling / identifying the wrong patterns. This can be rectified by increasing model complexity (adding layers etc.) and training epochs so that the model can learn to deal with the augmentations.
sp = SimplePreprocessor(width=32, height=32)
data_manager = DatasetManager([sp])
data_manager.load(label_folder_dict, verbose=100)
data_manager.process_data_label()
data_manager.train_valid_test_split()
birds 512 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 bottles 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 breads 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 butterfiles 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 cakes 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 cats 501 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 chickens 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 cows 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 dogs 501 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 ducks 496 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 elephants 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 fishes 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 handguns 448 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 horses 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 lions 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 lipsticks 400 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 seals 448 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 snakes 496 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 spiders 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 vases 368 Processed 100/500 Processed 200/500 Processed 300/500
from keras.utils import to_categorical
import random
def mixup(data_manager,batch_size,alpha=0.2):
l = len(data_manager.X_train)
mixed_data = []
mixed_labels = []
lam = np.random.beta(alpha,alpha)
print ("shape before mixup: X: ",data_manager.X_train.shape," y: ",data_manager.y_train.shape)
for i in range(0, l, 4):
if i + 1 < l:
x1, y1 = data_manager.X_train[i], data_manager.y_train[i]
x2, y2 = data_manager.X_train[i + 1], data_manager.y_train[i + 1]
# Generate random mixing coefficient from a beta distribution
lam = np.random.beta(alpha,alpha)
mixed_x = (lam * x1) + ((1 - lam) * x2)
mixed_y = (lam * y1) + ((1 - lam) * y2)
mixed_data.append(mixed_x)
mixed_labels.append(mixed_y)
data_manager.X_train = np.concatenate((data_manager.X_train, np.array(mixed_data)))
data_manager.y_train = np.concatenate((data_manager.y_train,np.array(mixed_labels)))
class YourModel(DefaultModel):
def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
name='network1',
width=32, height=32, depth=3,
num_classes=20,
is_augmentation = True,
activation_func='relu',
optimizer='adam',
batch_size=32,
num_epochs= 20):
super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation,
activation_func, optimizer, batch_size, num_epochs,
learning_rate, verbose)
self.num_channels = num_channels
self.mean_pool = mean_pool
self.batch_norm = batch_norm
self.use_skip = use_skip
self.blocks = blocks
def build_cnn(self,x):
#Insert your code here
x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
if self.batch_norm:
x1 = layers.BatchNormalization() (x1)
x1 = layers.Activation('relu') (x1)
x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
if self.batch_norm:
x2 = layers.BatchNormalization()(x2)
if x.shape != x2.shape:
if x2.shape[3] > x.shape[3]:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
else:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
x2_skip = layers.add([x, x2])
x2_skip = layers.Activation('relu')(x2_skip)
if self.mean_pool:
output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
else:
output_layer = x2_skip
return output_layer
def build_resnet(self):
self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
x = self.input_layer
for i in range (self.blocks):
x = self.build_cnn(x)
self.num_channels = self.num_channels*2
output_layer = GlobalAveragePooling2D()(x)
output_layer = layers.Flatten()(output_layer)
output_layer = layers.Dense(self.num_classes, activation='softmax')(output_layer)
self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
def fit(self, data_manager, batch_size=None, num_epochs=None):
#Insert your code here
batch_size = self.batch_size if batch_size is None else batch_size
num_epochs = self.num_epochs if num_epochs is None else num_epochs
self.model.compile(optimizer=self.optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
callbacks = [early_stopping]
if self.is_augmentation == True:
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
width_shift_range=0.05,
zoom_range = 0.05,
)
datagen.fit(data_manager.X_train)
it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = False, batch_size =batch_size)
self.history = self.model.fit(x = it,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
else:
self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
testModel = YourModel(16,5, True, True,True, 0.001,True)
testModel.build_resnet()
mixup(data_manager,16)
testModel.fit(data_manager, batch_size = 16, num_epochs = 20)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
shape before mixup: X: (7560, 32, 32, 3) y: (7560,) Epoch 1/20
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\data\ops\structured_function.py:264: UserWarning: Even though the `tf.config.experimental_run_functions_eagerly` option is set, this option does not apply to tf.data functions. To force eager execution of tf.data functions, please use `tf.data.experimental.enable_debug_mode()`. warnings.warn(
591/591 [==============================] - 30s 51ms/step - loss: 2.6239 - accuracy: 0.1894 - val_loss: 2.4572 - val_accuracy: 0.2717 Epoch 2/20 591/591 [==============================] - 30s 50ms/step - loss: 2.3045 - accuracy: 0.2848 - val_loss: 2.2291 - val_accuracy: 0.3044 Epoch 3/20 591/591 [==============================] - 31s 52ms/step - loss: 2.1120 - accuracy: 0.3287 - val_loss: 1.8786 - val_accuracy: 0.4355 Epoch 4/20 591/591 [==============================] - 34s 57ms/step - loss: 1.9323 - accuracy: 0.3872 - val_loss: 2.2891 - val_accuracy: 0.3562 Epoch 5/20 591/591 [==============================] - 34s 57ms/step - loss: 1.7978 - accuracy: 0.4207 - val_loss: 1.7680 - val_accuracy: 0.4778 Epoch 6/20 591/591 [==============================] - 33s 56ms/step - loss: 1.6581 - accuracy: 0.4618 - val_loss: 1.7923 - val_accuracy: 0.4715 Epoch 7/20 591/591 [==============================] - 34s 58ms/step - loss: 1.5390 - accuracy: 0.4896 - val_loss: 1.9705 - val_accuracy: 0.4556 Epoch 8/20 591/591 [==============================] - 34s 58ms/step - loss: 1.3993 - accuracy: 0.5232 - val_loss: 1.7156 - val_accuracy: 0.4915 Epoch 9/20 591/591 [==============================] - 35s 59ms/step - loss: 1.2768 - accuracy: 0.5550 - val_loss: 1.7433 - val_accuracy: 0.5021 Epoch 10/20 591/591 [==============================] - 34s 58ms/step - loss: 1.1250 - accuracy: 0.5878 - val_loss: 1.7705 - val_accuracy: 0.5148 Epoch 11/20 591/591 [==============================] - 34s 57ms/step - loss: 0.9957 - accuracy: 0.6177 - val_loss: 1.8656 - val_accuracy: 0.5011 Epoch 12/20 591/591 [==============================] - 34s 58ms/step - loss: 0.8425 - accuracy: 0.6492 - val_loss: 1.8326 - val_accuracy: 0.5285 Epoch 13/20 591/591 [==============================] - 34s 58ms/step - loss: 0.7374 - accuracy: 0.6666 - val_loss: 2.0426 - val_accuracy: 0.4937 Epoch 14/20 591/591 [==============================] - 35s 59ms/step - loss: 0.6287 - accuracy: 0.6916 - val_loss: 2.2982 - val_accuracy: 0.4820 Epoch 15/20 591/591 [==============================] - 35s 59ms/step - loss: 0.5435 - accuracy: 0.7114 - val_loss: 2.0785 - val_accuracy: 0.5285 15/15 [==============================] - 0s 24ms/step - loss: 2.3560 - accuracy: 0.4831
0.4830866754055023
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 15ms/step
<Figure size 640x480 with 0 Axes>
testModel.model.save('models/data_mixup_model.h5')
**Question 3.5** Implement the one-versus-all (OVA) loss. The details are as follows:
the sigmoid activation function to logits $h = [h_1, h_2,...,h_M]$ instead of the softmax activation function as usual to obtain $p = [p_1, p_2,...,p_M]$, meaning that $p_i = sigmoid(h_i), i=1,...,M$. Note that $M$ is the number of classes.Compare the model trained with the OVA loss and the same model trained with the standard cross-entropy loss.
Sometimes bug will occur when running OVA, please restart kernel and rerun the datasetmanager method before running this if the bug occurs
The OVA loss model performs slightly worse than the standard cross-entropy loss. The accuracies can be shown below:
With Data Augmentation + Early Stopping Accuracy:
OVA Loss + With Data Augmentation + Early Stopping Accuracy:
This value seems to vary run by run as OVA loss may sometimes exceed the performance of the standard CE loss model but the overall difference seems to be small.
tf.config.run_functions_eagerly(True)
#Insert your code here. You can add more cells if necessary
class OVA_loss (tf.keras.losses.Loss):
def __init__(self,eps=1E-10, num_classes = 20):
super(OVA_loss, self).__init__()
self.eps = eps
self.num_classes = num_classes
def call(self, y_true, y_pred):
y_true_1_hot = tf.one_hot(tf.transpose(tf.cast(y_true, tf.int32), perm= [1,0])[0], depth= self.num_classes, axis=-1)
loss_true_class = -tf.math.log(y_pred + self.eps) * y_true_1_hot
# Find the index of the true class
true_class_index = tf.argmax(y_true_1_hot, axis=-1)
# Create a mask for the true class
mask = tf.one_hot(true_class_index, depth= self.num_classes, on_value=1.0, off_value=0.0)
# Invert the mask (0s become 1s, and 1s become 0s)
y_true_modified = 1 - mask
# Compute the negative log-likelihood for the other class
loss_other_class = -tf.math.log(1.0 - y_pred + self.eps) * y_true_modified
# Combine the losses for both classes
total_loss = loss_true_class + loss_other_class
return total_loss
class YourModel(DefaultModel):
def __init__(self,num_channels,blocks,mean_pool,batch_norm,use_skip,learning_rate,verbose,
name='network1',
width=32, height=32, depth=3,
num_classes=20,
is_augmentation = True,
activation_func='relu',
optimizer='adam',
batch_size=32,
num_epochs= 20):
super(YourModel, self).__init__(name, width, height, depth, num_classes, is_augmentation,
activation_func, optimizer, batch_size, num_epochs,
learning_rate, verbose)
self.num_channels = num_channels
self.mean_pool = mean_pool
self.batch_norm = batch_norm
self.use_skip = use_skip
self.blocks = blocks
def build_cnn(self,x):
#Insert your code here
x1 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x)
if self.batch_norm:
x1 = layers.BatchNormalization() (x1)
x1 = layers.Activation('relu') (x1)
x2 = layers.Conv2D(self.num_channels, (3, 3), strides=(1, 1), padding='same')(x1)
if self.batch_norm:
x2 = layers.BatchNormalization()(x2)
if x.shape != x2.shape:
if x2.shape[3] > x.shape[3]:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x2.shape[3] - x.shape[3] , 0]])
x = tf.pad(x, pad_tns, mode ='CONSTANT', constant_values=0)
else:
pad_tns = tf.constant([[0,0],[0, 0] ,[0,0], [x.shape[3] - x2.shape[3] , 0]])
x = tf.pad(x2, pad_tns, mode ='CONSTANT', constant_values=0)
x2_skip = layers.add([x, x2])
x2_skip = layers.Activation('relu')(x2_skip)
if self.mean_pool:
output_layer = layers.AveragePooling2D(pool_size=(2, 2), padding='same')(x2_skip)
else:
output_layer = x2_skip
return output_layer
def build_resnet(self):
self.input_layer = layers.Input(shape=(self.width, self.height, self.depth))
x = self.input_layer
for i in range (self.blocks):
x = self.build_cnn(x)
self.num_channels = self.num_channels*2
output_layer = GlobalAveragePooling2D()(x)
output_layer = layers.Flatten()(output_layer)
output_layer = layers.Dense(self.num_classes, activation='sigmoid')(output_layer)
self.model = tf.keras.models.Model(inputs=self.input_layer, outputs=output_layer)
self.model.compile(optimizer=self.optimizer, loss= OVA_loss(), metrics=['accuracy'])
def fit(self, data_manager, batch_size=None, num_epochs=None):
#Insert your code here
batch_size = self.batch_size if batch_size is None else batch_size
num_epochs = self.num_epochs if num_epochs is None else num_epochs
self.model.compile(optimizer=self.optimizer, loss= OVA_loss(), metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_accuracy', patience=6)
callbacks = [early_stopping]
if self.is_augmentation == True:
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
brightness_range=(0.9, 1.1),
width_shift_range=0.2,
height_shift_range=0.1,
rotation_range= 10,
)
datagen.fit(data_manager.X_train)
it = datagen.flow(data_manager.X_train,data_manager.y_train, shuffle = False, batch_size =batch_size)
self.history = self.model.fit(x = it,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
else:
self.history = self.model.fit(x = data_manager.X_train, y = data_manager.y_train,
validation_data = (data_manager.X_valid, data_manager.y_valid),
epochs = num_epochs, batch_size = batch_size, callbacks = callbacks,verbose= self.verbose)
testModel = YourModel(32,4, True, True,True, 0.001,True)
testModel.build_resnet()
mixup(data_manager,16)
testModel.fit(data_manager, batch_size = 32, num_epochs = 40)
testModel.compute_accuracy(data_manager.X_test, data_manager.y_test)
shape before mixup: X: (7560, 32, 32, 3) y: (7560,) Epoch 1/40
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\data\ops\structured_function.py:264: UserWarning: Even though the `tf.config.experimental_run_functions_eagerly` option is set, this option does not apply to tf.data functions. To force eager execution of tf.data functions, please use `tf.data.experimental.enable_debug_mode()`. warnings.warn(
355/355 [==============================] - 22s 54ms/step - loss: 0.1922 - accuracy: 0.1469 - val_loss: 0.1752 - val_accuracy: 0.2548 Epoch 2/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1721 - accuracy: 0.2139 - val_loss: 0.1638 - val_accuracy: 0.2854 Epoch 3/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1658 - accuracy: 0.2498 - val_loss: 0.1596 - val_accuracy: 0.3034 Epoch 4/40 355/355 [==============================] - 19s 53ms/step - loss: 0.1597 - accuracy: 0.2867 - val_loss: 0.1429 - val_accuracy: 0.3953 Epoch 5/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1539 - accuracy: 0.3161 - val_loss: 0.1389 - val_accuracy: 0.4228 Epoch 6/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1490 - accuracy: 0.3385 - val_loss: 0.1300 - val_accuracy: 0.4693 Epoch 7/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1441 - accuracy: 0.3597 - val_loss: 0.1359 - val_accuracy: 0.4313 Epoch 8/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1389 - accuracy: 0.3847 - val_loss: 0.1400 - val_accuracy: 0.4397 Epoch 9/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1349 - accuracy: 0.4059 - val_loss: 0.1252 - val_accuracy: 0.5032 Epoch 10/40 355/355 [==============================] - 19s 55ms/step - loss: 0.1305 - accuracy: 0.4248 - val_loss: 0.1472 - val_accuracy: 0.4207 Epoch 11/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1261 - accuracy: 0.4369 - val_loss: 0.1267 - val_accuracy: 0.5021 Epoch 12/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1214 - accuracy: 0.4577 - val_loss: 0.1312 - val_accuracy: 0.4937 Epoch 13/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1175 - accuracy: 0.4652 - val_loss: 0.1252 - val_accuracy: 0.5349 Epoch 14/40 355/355 [==============================] - 19s 55ms/step - loss: 0.1124 - accuracy: 0.4874 - val_loss: 0.1533 - val_accuracy: 0.4524 Epoch 15/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1086 - accuracy: 0.4976 - val_loss: 0.1320 - val_accuracy: 0.5275 Epoch 16/40 355/355 [==============================] - 19s 54ms/step - loss: 0.1035 - accuracy: 0.5130 - val_loss: 0.1427 - val_accuracy: 0.5116 Epoch 17/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0995 - accuracy: 0.5309 - val_loss: 0.1481 - val_accuracy: 0.4958 Epoch 18/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0948 - accuracy: 0.5362 - val_loss: 0.1305 - val_accuracy: 0.5412 Epoch 19/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0906 - accuracy: 0.5473 - val_loss: 0.1388 - val_accuracy: 0.5211 Epoch 20/40 355/355 [==============================] - 19s 55ms/step - loss: 0.0870 - accuracy: 0.5583 - val_loss: 0.1559 - val_accuracy: 0.4704 Epoch 21/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0825 - accuracy: 0.5648 - val_loss: 0.1473 - val_accuracy: 0.4894 Epoch 22/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0771 - accuracy: 0.5762 - val_loss: 0.1559 - val_accuracy: 0.4672 Epoch 23/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0734 - accuracy: 0.5833 - val_loss: 0.1557 - val_accuracy: 0.5159 Epoch 24/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0700 - accuracy: 0.5862 - val_loss: 0.1415 - val_accuracy: 0.5518 Epoch 25/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0657 - accuracy: 0.5994 - val_loss: 0.1450 - val_accuracy: 0.5476 Epoch 26/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0618 - accuracy: 0.6021 - val_loss: 0.1670 - val_accuracy: 0.4820 Epoch 27/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0593 - accuracy: 0.6057 - val_loss: 0.1708 - val_accuracy: 0.4503 Epoch 28/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0563 - accuracy: 0.6096 - val_loss: 0.1721 - val_accuracy: 0.4757 Epoch 29/40 355/355 [==============================] - 19s 55ms/step - loss: 0.0542 - accuracy: 0.6115 - val_loss: 0.1744 - val_accuracy: 0.4831 Epoch 30/40 355/355 [==============================] - 19s 54ms/step - loss: 0.0504 - accuracy: 0.6131 - val_loss: 0.1577 - val_accuracy: 0.5328 15/15 [==============================] - 0s 21ms/step - loss: 0.1532 - accuracy: 0.5338
0.5338266491889954
testModel.plot_progress()
testModel.plot_prediction(data_manager.X_test, data_manager.y_test, data_manager.classes)
30/30 [==============================] - 0s 11ms/step
<Figure size 640x480 with 0 Axes>
testModel.model.save('models/ova_model.h5')
**Question 3.6** Attack your best obtained model with PGD, MIM, and FGSM attacks with $\epsilon= 0.0313, k=20, \eta= 0.002$ on the testing set. Write the code for the attacks and report the robust accuracies. Also choose a random set of 20 clean images in the testing set and visualize the original and attacked images.
Rerun Data Manager for this part to undo data mixup
sp = SimplePreprocessor(width=32, height=32)
data_manager = DatasetManager([sp])
data_manager.load(label_folder_dict, verbose=100)
data_manager.process_data_label()
data_manager.train_valid_test_split()
birds 512 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 bottles 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 breads 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 butterfiles 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 cakes 432 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 cats 501 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 chickens 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 cows 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 dogs 501 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 ducks 496 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 elephants 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 fishes 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 handguns 448 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 horses 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 lions 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 lipsticks 400 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 seals 448 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 snakes 496 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 spiders 500 Processed 100/500 Processed 200/500 Processed 300/500 Processed 400/500 Processed 500/500 vases 368 Processed 100/500 Processed 200/500 Processed 300/500
#FGSM Attack code adapted from Tut_06b
from tensorflow.keras.models import load_model
def fgsm_attack(model, input_image, input_label=None,
epsilon=0.0313,
clip_value_min=0.,
clip_value_max=1.0,
soft_label=False,
from_logits=True):
"""
Args:
model: pretrained model
input_image: original (clean) input image (tensor)
input_label: original label (tensor, categorical representation)
epsilon: perturbation boundary
clip_value_min, clip_value_max: range of valid input
from_logits = True: attack from logits otherwise attack from prediction probabilites
Note:
we expect the output of model should be logits vector
"""
loss_fn = tf.keras.losses.sparse_categorical_crossentropy # compute CE loss from logits or prediction probabilities
if type(input_image) is np.ndarray:
input_image = tf.convert_to_tensor(input_image)
if type(input_label) is np.ndarray:
input_label = tf.convert_to_tensor(input_label)
with tf.GradientTape() as tape:
tape.watch(input_image)
output = model(input_image)
if not soft_label:
loss = loss_fn(input_label, output, from_logits=from_logits) # use ground-truth label to attack
else:
pred_label = tf.math.argmax(output, axis=1) # use predicted label to attack
loss = loss_fn(pred_label, output, from_logits=from_logits)
gradient = tape.gradient(loss, input_image) # get the gradients of the loss w.r.t. the input image
adv_image = input_image + epsilon * tf.sign(gradient) # get the final adversarial examples
adv_image = tf.clip_by_value(adv_image, clip_value_min, clip_value_max) # clip to a valid range
adv_image = tf.stop_gradient(adv_image) # stop the gradient to make the adversarial image as a constant input
return adv_image
#PGD Attack code adapted from Tut_06b
def pgd_attack(model, input_image, input_label= None,
epsilon=0.0313,
num_steps=10,
step_size=0.002,
clip_value_min=0.,
clip_value_max=1.0,
soft_label=False,
from_logits= False):
"""
Args:
model: pretrained model
input_image: original (clean) input image (tensor)
input_label: original label (tensor, categorical representation)
epsilon: perturbation boundary
num_steps: number of attack steps
step_size: size of each move in each attack step
clip_value_min, clip_value_max: range of valid input
from_logits = True: attack from logits otherwise attack from prediction probabilites
Note:
we expect the output of model should be logits vector
"""
loss_fn = tf.keras.losses.sparse_categorical_crossentropy #compute CE loss from logits or prediction probabilities
if type(input_image) is np.ndarray:
input_image = tf.convert_to_tensor(input_image)
if type(input_label) is np.ndarray:
input_label = tf.convert_to_tensor(input_label)
# random initialization around input_image
random_noise = tf.random.uniform(shape=input_image.shape, minval=-epsilon, maxval=epsilon)
adv_image = input_image + random_noise
for _ in range(num_steps):
with tf.GradientTape(watch_accessed_variables=False) as tape:
tape.watch(adv_image)
adv_output = model(adv_image)
if not soft_label:
loss = loss_fn(input_label, adv_output, from_logits= from_logits) # use ground-truth label to attack
else:
pred_label = tf.math.argmax(adv_output, axis=1)
loss = loss_fn(pred_label, adv_output, from_logits= from_logits) # use predicted label to attack
gradient = tape.gradient(loss, adv_image) # get the gradient of the loss w.r.t. the current point
adv_image = adv_image + step_size * tf.sign(gradient) # move current adverarial example along the gradient direction with step size is eta
adv_image = tf.clip_by_value(adv_image, input_image-epsilon, input_image+epsilon) # clip to a valid boundary
adv_image = tf.clip_by_value(adv_image, clip_value_min, clip_value_max) # clip to a valid range
adv_image = tf.stop_gradient(adv_image) # stop the gradient to make the adversarial image as a constant input
return adv_image
#MIM Attack code adapted from Tut_06b
def mim_attack(model, input_image, input_label= None,
epsilon=0.0313,
gamma= 0.9,
num_steps=20,
step_size=0.002,
clip_value_min=0.,
clip_value_max=1.0,
soft_label=False,
from_logits= True):
"""
Args:
model: pretrained model
input_image: original (clean) input image (tensor)
input_label: original label (tensor, categorical representation)
epsilon: perturbation boundary
gamma: momentum decay
num_steps: number of attack steps
step_size: size of each move in each attack step
clip_value_min, clip_value_max: range of valid input
from_logits = True: attack from logits otherwise attack from prediction probabilites
Note:
we expect the output of model should be logits vector
"""
loss_fn = tf.keras.losses.sparse_categorical_crossentropy # compute CE loss from logits or prediction probabilities
if type(input_image) is np.ndarray:
input_image = tf.convert_to_tensor(input_image)
if type(input_label) is np.ndarray:
input_label = tf.convert_to_tensor(input_label)
# random initialization around input_image
random_noise = tf.random.uniform(shape=input_image.shape, minval=-epsilon, maxval=epsilon)
adv_image = input_image + random_noise
adv_noise = random_noise
for _ in range(num_steps):
with tf.GradientTape(watch_accessed_variables=False) as tape:
tape.watch(adv_image)
adv_output = model(adv_image)
if not soft_label:
loss = loss_fn(input_label, adv_output, from_logits=from_logits) # use ground-truth label to attack
else:
pred_label = tf.math.argmax(adv_output, axis=1)
loss = loss_fn(pred_label, adv_output, from_logits=from_logits) # use predicted label to attack
gradient = tape.gradient(loss, adv_image) # get the gradient of the loss w.r.t. the current point
adv_image_new = adv_image + step_size * tf.sign(gradient) # move current adverarial example along the gradient direction with step size is eta
adv_image_new = tf.clip_by_value(adv_image_new, input_image-epsilon, input_image+epsilon) # clip to a valid boundary
adv_image_new = tf.clip_by_value(adv_image_new, clip_value_min, clip_value_max) # clip to a valid range
adv_noise = gamma*adv_noise + (1-gamma)*(adv_image_new - adv_image)
adv_image = adv_image_new
adv_image = tf.stop_gradient(adv_image) # stop the gradient to make the adversarial image as a constant input
adv_image = adv_image + adv_noise
adv_image = tf.clip_by_value(adv_image, input_image-epsilon, input_image+epsilon) # clip to a valid boundary
adv_image = tf.clip_by_value(adv_image, clip_value_min, clip_value_max) # clip to a valid range
return adv_image
# Load the saved model
from matplotlib import pyplot as plt
from tensorflow.keras.applications.vgg19 import preprocess_input, decode_predictions
loaded_model = load_model('models/data_mixup_model.h5')
def attack_pgd ():
random_indices = np.random.choice(len(data_manager.X_test), size=20, replace=False)
correct = 0
for i in random_indices:
x = np.expand_dims(data_manager.X_test[i], axis=0)
x = tf.cast(x, dtype=tf.float32)
x_pgd= pgd_attack(loaded_model, x, data_manager.y_test[i])
preds = loaded_model.predict(x)
pgd_pred = loaded_model.predict(x_pgd)
true_label= data_manager.classes[np.argmax(preds)]
adv_label= data_manager.classes[np.argmax(pgd_pred)]
if true_label == adv_label:
correct += 1
img = data_manager.X_test[i]
img_pgd = np.squeeze(x_pgd.numpy())
noise_pgd = np.clip(np.abs(img_pgd - img)*20, 0, 255).astype('int') # we multiply the noise by 20 for visualization
fig = plt.figure(figsize=(15, 15*3))
shown_img = img_pgd
for i in range(3):
shown_img = img if i==0 else noise_pgd if i==1 else img_pgd
shown_label = 'Original image: {}'.format(true_label) if i==0 else 'Noise' if i==1 else 'Adversarial image: {}'.format(adv_label)
plt.subplot(1,3,i+1)
plt.imshow(shown_img)
plt.xlabel(shown_label, fontsize= 12)
plt.xticks([])
plt.yticks([])
plt.grid(False)
return correct / 20 if correct > 0 else 0
attack_pgd()
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\tensorflow\python\data\ops\structured_function.py:264: UserWarning: Even though the `tf.config.experimental_run_functions_eagerly` option is set, this option does not apply to tf.data functions. To force eager execution of tf.data functions, please use `tf.data.experimental.enable_debug_mode()`. warnings.warn( Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 33ms/step 1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step 1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
0.15
def attack_fgsm ():
random_indices = np.random.choice(len(data_manager.X_test), size=20, replace=False)
correct = 0
for i in random_indices:
x = np.expand_dims(data_manager.X_test[i], axis=0)
x = tf.cast(x, dtype=tf.float32)
x_pgd= fgsm_attack(loaded_model, x, data_manager.y_test[i])
preds = loaded_model.predict(x)
pgd_pred = loaded_model.predict(x_pgd)
true_label= data_manager.classes[np.argmax(preds)]
adv_label= data_manager.classes[np.argmax(pgd_pred)]
if true_label == adv_label:
correct += 1
img = data_manager.X_test[i]
img_pgd = np.squeeze(x_pgd.numpy())
noise_pgd = np.clip(np.abs(img_pgd - img)*20, 0, 255).astype('int') # we multiply the noise by 20 for visualization
fig = plt.figure(figsize=(15, 15*3))
shown_img = img_pgd
for i in range(3):
shown_img = img if i==0 else noise_pgd if i==1 else img_pgd
shown_label = 'Original image: {}'.format(true_label) if i==0 else 'Noise' if i==1 else 'Adversarial image: {}'.format(adv_label)
plt.subplot(1,3,i+1)
plt.imshow(shown_img)
plt.xlabel(shown_label, fontsize= 12)
plt.xticks([])
plt.yticks([])
plt.grid(False)
return correct / 20 if correct > 0 else 0
attack_fgsm()
1/1 [==============================] - 0s 36ms/step 1/1 [==============================] - 0s 34ms/step
C:\Users\manut\anaconda3\envs\gpu\lib\site-packages\keras\backend.py:5582: UserWarning: "`sparse_categorical_crossentropy` received `from_logits=True`, but the `output` argument was produced by a Softmax activation and thus does not represent logits. Was this intended? output, from_logits = _get_logits( Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 28ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 32ms/step 1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 29ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step 1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 34ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 35ms/step 1/1 [==============================] - 0s 30ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step 1/1 [==============================] - 0s 25ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 24ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 32ms/step 1/1 [==============================] - 0s 32ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
0.05
def attack_mim ():
random_indices = np.random.choice(len(data_manager.X_test), size=20, replace=False)
correct = 0
for i in random_indices:
x = np.expand_dims(data_manager.X_test[i], axis=0)
x = tf.cast(x, dtype=tf.float32)
x_pgd= fgsm_attack(loaded_model, x, data_manager.y_test[i])
preds = loaded_model.predict(x)
pgd_pred = loaded_model.predict(x_pgd)
true_label= data_manager.classes[np.argmax(preds)]
adv_label= data_manager.classes[np.argmax(pgd_pred)]
if true_label == adv_label:
correct += 1
img = data_manager.X_test[i]
img_pgd = np.squeeze(x_pgd.numpy())
noise_pgd = np.clip(np.abs(img_pgd - img)*20, 0, 255).astype('int') # we multiply the noise by 20 for visualization
fig = plt.figure(figsize=(15, 15*3))
shown_img = img_pgd
for i in range(3):
shown_img = img if i==0 else noise_pgd if i==1 else img_pgd
shown_label = 'Original image: {}'.format(true_label) if i==0 else 'Noise' if i==1 else 'Adversarial image: {}'.format(adv_label)
plt.subplot(1,3,i+1)
plt.imshow(shown_img)
plt.xlabel(shown_label, fontsize= 12)
plt.xticks([])
plt.yticks([])
plt.grid(False)
return correct / 20 if correct > 0 else 0
attack_mim()
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 30ms/step 1/1 [==============================] - 0s 33ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 27ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 28ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 28ms/step 1/1 [==============================] - 0s 29ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 30ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 31ms/step 1/1 [==============================] - 0s 30ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 30ms/step 1/1 [==============================] - 0s 31ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 27ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 25ms/step 1/1 [==============================] - 0s 26ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
1/1 [==============================] - 0s 26ms/step 1/1 [==============================] - 0s 28ms/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
0.25
**Question 3.7** Train a robust model using adversarial training with PGD ${\epsilon= 0.0313, k=10, \eta= 0.002}$. Write the code for the adversarial training and report the robust accuracies. After finishing the training, you need to store your best robust model in the folder ./models and load the model to evaluate the robust accuracies for PGD, MIM, and FGSM attacks with $\epsilon= 0.0313, k=20, \eta= 0.002$ on the testing set.
Accuracy before training
Accuracy after training and attacking with step size 20
As we can see, the adversarial training here has successfully reinforced our model against attacks and it is now able to give higher percentage of correct results after just 5 epochs.
from sklearn.metrics import accuracy_score
lenet_defence = loaded_model
#Insert your code here. You can add more cells if necessary
optimizer = tf.optimizers.Adam(learning_rate=0.001)
loss_obj = tf.nn.sparse_softmax_cross_entropy_with_logits
# metrics to track the different accuracies.
train_loss = tf.metrics.Mean(name='train_loss')
test_acc_clean = tf.metrics.SparseCategoricalAccuracy()
test_acc_pgd = tf.metrics.SparseCategoricalAccuracy()
batch_size = 32
def train_step_adv(x, x_adv, y):
with tf.GradientTape() as tape:
logits = lenet_defence(x)
logits_adv = lenet_defence(x_adv)
loss = (loss_obj(y, logits) + loss_obj(y, logits_adv))/2
gradients = tape.gradient(loss, lenet_defence.trainable_variables)
optimizer.apply_gradients(zip(gradients, lenet_defence.trainable_variables))
return loss
epochs = 5 # number of epochs
for epoch in range(epochs):
print ("\n")
print ("epoch: ",epoch)
# keras like display of progress
progress_bar_train = tf.keras.utils.Progbar(data_manager.X_train.shape[0], verbose=1)
for i in range (int (data_manager.X_train.shape[0]/ batch_size)):
current_data = data_manager.next_batch(batch_size)
(x,y) = current_data
x = tf.cast(x, dtype=tf.float32)
# train based on pgd attack according to the perimeters given
x_adv = pgd_attack(lenet_defence, x, y, epsilon=0.0313,
num_steps=10,
step_size=0.002,
clip_value_min=0.,
clip_value_max=1.0,
soft_label=False,
from_logits= False)
loss = train_step_adv(x, x_adv, y)
y_pred = lenet_defence(x)
test_acc_clean(y, y_pred)
test_acc_pgd(y, lenet_defence(x_adv))
train_loss(loss)
progress_bar_train.add(x.shape[0], values=[('loss', train_loss.result()), ("acc (%)", test_acc_clean.result() * 100),("pgd (%)", test_acc_pgd.result() * 100)])
print()
epoch: 0 7552/7560 [============================>.] - ETA: 0s - loss: 2.6384 - acc (%): 80.8675 - pgd (%): 17.7838 epoch: 1 7552/7560 [============================>.] - ETA: 0s - loss: 2.6217 - acc (%): 78.2272 - pgd (%): 22.1566 epoch: 2 7552/7560 [============================>.] - ETA: 0s - loss: 2.6073 - acc (%): 78.3158 - pgd (%): 24.6540 epoch: 3 7552/7560 [============================>.] - ETA: 0s - loss: 2.5977 - acc (%): 78.5100 - pgd (%): 26.2517 epoch: 4 7552/7560 [============================>.] - ETA: 0s - loss: 2.5896 - acc (%): 78.7079 - pgd (%): 27.5303
y_batch_adv = []
y_adv = []
y_true = []
for i in range(data_manager.X_test.shape[0] // batch_size):
idx = data_manager.random.choice(data_manager.X_test.shape[0], batch_size,
replace=batch_size > data_manager.X_test.shape[0])
x,y = data_manager.X_test[idx], data_manager.y_test[idx]
x_fgsm = fgsm_attack(lenet_defence, tf.cast(x, tf.float32), y, epsilon = 0.0313,
soft_label=False, clip_value_min=0.0, clip_value_max=255.0, from_logits=False)
y_batch_adv = np.argmax(loaded_model(x_fgsm).numpy(), 1)
y_adv.append(y_batch_adv[0].tolist())
y_true.append(y[0].tolist())
test_adv_acc = accuracy_score(y_true, y_adv)
print("FGSM attack accuracy:{}".format(test_adv_acc))
FGSM attack accuracy:0.3103448275862069
y_batch_adv = []
y_adv = []
y_true = []
for i in range(data_manager.X_test.shape[0] // batch_size):
idx = data_manager.random.choice(data_manager.X_test.shape[0], batch_size,
replace=batch_size > data_manager.X_test.shape[0])
x,y = data_manager.X_test[idx], data_manager.y_test[idx]
x_pgd = pgd_attack(lenet_defence, tf.cast(x, tf.float32), y, num_steps=20, step_size= 0.002, epsilon = 0.0313,
soft_label=False, clip_value_min=0.0, clip_value_max=255.0, from_logits=False)
y_batch_adv = np.argmax(loaded_model(x_pgd).numpy(), 1)
y_adv.append(y_batch_adv[0].tolist())
y_true.append(y[0].tolist())
test_adv_acc = accuracy_score(y_true, y_adv)
print("PGD attack accuracy:{}".format(test_adv_acc))
PGD attack accuracy:0.2413793103448276
y_batch_adv = []
y_adv = []
y_true = []
for i in range(data_manager.X_test.shape[0] // batch_size):
idx = data_manager.random.choice(data_manager.X_test.shape[0], batch_size,
replace=batch_size > data_manager.X_test.shape[0])
x,y = data_manager.X_test[idx], data_manager.y_test[idx]
x_mim = mim_attack(lenet_defence, tf.cast(x, tf.float32), y, num_steps=20, step_size= 0.002, epsilon = 0.0313,
soft_label=False, clip_value_min=0.0, clip_value_max=255.0, from_logits=False)
y_batch_adv = np.argmax(loaded_model(x_mim).numpy(), 1)
y_adv.append(y_batch_adv[0].tolist())
y_true.append(y[0].tolist())
test_adv_acc = accuracy_score(y_true, y_adv)
print("MIM attack accuracy:{}".format(test_adv_acc))
MIM attack accuracy:0.27586206896551724
**Question 3.8 (Kaggle competition)**
You can reuse the best model obtained in this assignment or develop new models to evaluate on the testing set of the FIT5215 Kaggle competion. However, to gain any points for this question, your testing accuracy must exceed the accuracy threshold from a base model developed by us as shown in the leader board of the competition.
The marks for this question are as follows:
**Tips and requirements**